HSSF / XSSF Interaction with Microsoft Excel ( .xls and .xlsx )
Overview
We will discuss HSSF / XSSF Interaction with Microsoft Excel ( .xls and .xlsx ). Mostly, interacting with excel documents in automation frameworks built with the open source we use either Apache POI or JExcel ( a.k.a JXL API ). In this article, we will concentrate on Apache POI.
Test data in Excel files might be in .xls or .xlsx format. We need to keep some things in mind before interacting with these formats. Depending on the file format we need to change the API being used. Below are some of the key points.
The wrong usage of the API would end with exceptions in your code.
Sample exception below:
org.apache.poi.poifs.filesystem.OfficeXmlFileException: The supplied data appears to be in the Office 2007+ XML. You are calling the part of POI that deals with OLE2 Office Documents. You need to call a different part of POI to process this data (eg XSSF instead of HSSF) Â Â Â at org.apache.poi.poifs.storage.HeaderBlock.<init>(HeaderBlock.java:131) Â Â Â at org.apache.poi.poifs.storage.HeaderBlock.<init>(HeaderBlock.java:104) Â Â Â at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:128) Â Â Â at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:342) Â Â Â at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:323)
The Difference:
HSSF is the POI’s pure Java implementation of the Excel ’97(-2007) file format. XSSF is the POI’s pure Java implementation of the Excel 2007+ OOXML file format.( i.e .xlsx format )
Both provide ways to read, create, modify, read and write excel spreadsheets. The below picture depicts the API usage for the file format:
1. An event model API for efficient read-only access
2.Usermodel API for creating, reading and modifying xls files
If you’re merely reading spreadsheets, then use the eventmodel API in either the org.apache.poi.hssf.eventusermodel package, or the org.apache.poi.xssf.eventusermodel package, depending on your file format. Also, if you’re modifying spreadsheets then use the usermodel API. Furthermore, you can also generate spreadsheets by this way.