Bug 42652

Summary: HSSF cannot read excel file, Record size problems
Product: POI
Reporter: Rainer Schwarze <rsc>
Component: HSSF
Assignee: POI Developers List <dev>
Status: RESOLVED WONTFIX    
Severity: normal    
Priority: P2    
Version: unspecified   
Target Milestone: ---   
Hardware: Other   
OS: other   
Attachments: Excel file with unexpected record sizes
Test case

Description Rainer Schwarze 2007-06-13 04:49:08 UTC
The attached Excel file has several problems related to HSSF. First, it is not in
OLE2 format, but that can be solved by wrapping it in a POIFSFileSystem.
Second, and more relevant, several records in the Excel file are shorter
than HSSF expects. As far as I understand, the Excel file is somewhat
"non-standard". (Maybe the issues with this file are related to bug #42564.)

Attached are the Excel file and test code to show the problem. 

I posted a message with more details about what I found out so far to the
mailing list. If that should be included/attached here, please say so.

Best wishes,
Rainer Schwarze
Comment 1 Rainer Schwarze 2007-06-13 04:51:52 UTC
Created attachment 20338 [details]
Excel file with unexpected record sizes

This Excel file is not in OLE2 format; to read it with HSSF, one needs to wrap
it inside a POIFSFileSystem.
Comment 2 Rainer Schwarze 2007-06-13 04:52:39 UTC
Created attachment 20339 [details]
Test case
Comment 3 Josh Micich 2008-05-10 18:05:44 UTC
Tried the example and test file with POI 3.1-beta1. The first crash is in DimensionsRecord, where POI expects to read 14 bytes but only gets 10. That's a strong hint that the actual document is really in BIFF3-BIFF5 format. You noted that several records are shorter than expected, which supports this conclusion.

Setting the workbook stream name to "Workbook" (which would indicate BIFF8) is not enough.  POI can only read spreadsheets that *fully* meet the BIFF8 spec.
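For anyone who wants to check a file like this before handing it to POI, here is a rough sketch of the record-size test described above. It is not POI code and uses only the JDK. It assumes the input is the raw BIFF record stream (a flat sequence of 2-byte record id, 2-byte length, then body, all little-endian) and that DIMENSIONS has record id 0x0200. The class and method names (BiffDimensionsCheck, dimensionsLength, guessFormat) are made up for illustration, and the sample bytes are synthetic, not taken from the attachment.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class BiffDimensionsCheck {
    // Record id of DIMENSIONS (assumed here; BIFF2 uses a different id).
    static final int SID_DIMENSIONS = 0x0200;

    /** Returns the body length of the first DIMENSIONS record, or -1 if none is found. */
    static int dimensionsLength(byte[] stream) {
        ByteBuffer buf = ByteBuffer.wrap(stream).order(ByteOrder.LITTLE_ENDIAN);
        while (buf.remaining() >= 4) {
            int sid = buf.getShort() & 0xFFFF; // record id
            int len = buf.getShort() & 0xFFFF; // body length in bytes
            if (len > buf.remaining()) {
                break; // truncated stream, stop scanning
            }
            if (sid == SID_DIMENSIONS) {
                return len;
            }
            buf.position(buf.position() + len); // skip the record body
        }
        return -1;
    }

    /** Maps the DIMENSIONS body length to a format guess: 14 bytes is BIFF8, 10 bytes is BIFF3-BIFF5. */
    static String guessFormat(int dimensionsLength) {
        switch (dimensionsLength) {
            case 14: return "BIFF8";
            case 10: return "BIFF3-BIFF5";
            default: return "unknown";
        }
    }

    public static void main(String[] args) {
        // Synthetic stream: one DIMENSIONS record with a 10-byte body.
        byte[] sample = {
            0x00, 0x02,             // record id 0x0200
            0x0A, 0x00,             // length 10
            0, 0, 1, 0, 0, 0, 1, 0, 0, 0
        };
        int len = dimensionsLength(sample);
        System.out.println(len + " bytes -> " + guessFormat(len));
    }
}
```

A length of 10 here would match the crash seen in DimensionsRecord. This is only a hint, not a full format check; a file can still deviate from BIFF8 in other records.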

(No, bug 42564 was unrelated. It was predominantly about ArrayPtg encoding issues.)