Bug 46269

Summary: BIFF2 XLS file not reading. "Invalid header signature"
Product: POI
Reporter: Syam Pillai <syam>
Component: HSSF
Assignee: POI Developers List <dev>
Severity: normal
CC: obidin
Priority: P2    
Version: 3.5-dev   
Target Milestone: ---   
Hardware: PC   
OS: Linux   
Attachments: The problematic file. (attachment 22916)
             zip two java files (attachment 22963)

Description Syam Pillai 2008-11-23 00:30:45 UTC
Created attachment 22916 [details]
The problematic file.

Java command executed:

java org.apache.poi.poifs.filesystem.POIFSFileSystem test.xls out.xls

While reading an Excel file, I get the following exception:

Exception in thread "main" java.io.IOException: Invalid header signature; read 4503608217567241, expected -2226271756974174256
        at org.apache.poi.poifs.storage.HeaderBlockReader.<init>(HeaderBlockReader.java:112)
        at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:151)
        at org.apache.poi.poifs.filesystem.POIFSFileSystem.main(POIFSFileSystem.java:415)

The test.xls file is attached. The file was received from a government organization, and there is no way to verify which version of Microsoft Excel they use. The file opens fine in Excel and OpenOffice.
Comment 1 Josh Micich 2008-11-28 19:18:19 UTC
The attached file is a BIFF2 file.  The only BIFF version POI supports is BIFF8.

When you open and re-save with Excel or OO, the file is silently converted to BIFF8.

Improved error message added in svn r721620.  I am marking this bug as 'WONTFIX' because it would be difficult to extend POI to handle previous BIFF versions.
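
For illustration, the two numbers in the exception above can be decoded by hand. The sketch below is not POI's actual code; the class and method names (HeaderSniffer, readLongLE, describe) are hypothetical. It reads the first 8 bytes of a file as a little-endian long, compares the result with the OLE2 signature, and otherwise checks whether the bytes look like a BIFF2 BOF record (record id 0x0009 with a 4-byte body), which is what the "read" value in the report decodes to.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class HeaderSniffer {
    // OLE2 magic bytes D0 CF 11 E0 A1 B1 1A E1, read as a little-endian long.
    public static final long OLE2_SIGNATURE = 0xE11AB1A1E011CFD0L;

    // Interprets the first 8 bytes of a file as a little-endian long.
    public static long readLongLE(byte[] first8) {
        return ByteBuffer.wrap(first8, 0, 8).order(ByteOrder.LITTLE_ENDIAN).getLong();
    }

    // Rough classification only; real files need more checks than this.
    public static String describe(byte[] first8) {
        long sig = readLongLE(first8);
        if (sig == OLE2_SIGNATURE) {
            return "OLE2 container";
        }
        int sid = (int) (sig & 0xFFFF);          // record id, bytes 0-1
        int len = (int) ((sig >>> 16) & 0xFFFF); // record length, bytes 2-3
        if (sid == 0x0009 && len == 4) {
            return "BIFF2 BOF record (raw stream, no OLE2 container)";
        }
        return "unknown";
    }

    public static void main(String[] args) {
        // The 8 bytes that produce the "read" value from the exception.
        byte[] header = {0x09, 0x00, 0x04, 0x00, 0x02, 0x00, 0x10, 0x00};
        System.out.println(readLongLE(header)); // 4503608217567241
        System.out.println(describe(header));
    }
}
```

A check of this kind explains why the file fails immediately: it starts with a bare BIFF record instead of the OLE2 header that POIFSFileSystem expects.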
Comment 2 Josh Micich 2008-11-29 00:48:30 UTC
Created attachment 22963 [details]
zip two java files

I took a look at the example file (Attachment id=22916) and saw that only 5 BIFF2 record types were present.  It was relatively easy to write a BIFF2 stream reader that would handle just these records.  This might be a viable solution path if your input stays relatively simple.  Here is some sample code to call the attached converter:

// BIFF2To8Converter comes from attachment 22963; it is not part of POI
InputStream is = new FileInputStream("ex46269-22916.xls");
HSSFWorkbook wb = BIFF2To8Converter.convert(is, "Sheet1");
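
To see which record types a given BIFF2 file contains before relying on such a converter, the stream can be walked record by record. The following is a minimal sketch, not the attached converter: the class name Biff2RecordWalker is hypothetical, and it assumes the usual BIFF record layout of a 2-byte little-endian record id, a 2-byte little-endian length, then that many bytes of data.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

public class Biff2RecordWalker {
    // Returns {sid, length} for every record header in the raw stream.
    public static List<int[]> walk(byte[] stream) {
        ByteBuffer buf = ByteBuffer.wrap(stream).order(ByteOrder.LITTLE_ENDIAN);
        List<int[]> headers = new ArrayList<>();
        while (buf.remaining() >= 4) {
            int sid = buf.getShort() & 0xFFFF; // record id
            int len = buf.getShort() & 0xFFFF; // body length in bytes
            if (len > buf.remaining()) {
                throw new IllegalArgumentException(
                        "Truncated record 0x" + Integer.toHexString(sid));
            }
            buf.position(buf.position() + len); // skip the record body
            headers.add(new int[] {sid, len});
        }
        return headers;
    }

    public static void main(String[] args) {
        // A BOF record (id 0x0009, 4 data bytes) followed by an empty EOF record (id 0x000A).
        byte[] sample = {0x09, 0x00, 0x04, 0x00, 0x02, 0x00, 0x10, 0x00,
                         0x0A, 0x00, 0x00, 0x00};
        for (int[] h : walk(sample)) {
            System.out.println("sid=0x" + Integer.toHexString(h[0]) + " len=" + h[1]);
        }
    }
}
```

If the set of record ids found this way is small and stable, a hand-written reader limited to those records is a realistic option; otherwise re-saving the file from Excel or OpenOffice remains the simpler route.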