Summary: | [Regression in 3.0.2] Unable to read an Excel file | ||
---|---|---|---|
Product: | POI | Reporter: | Laurent Poublan <lpoublan> |
Component: | HPSF | Assignee: | POI Developers List <dev> |
Status: | RESOLVED FIXED | ||
Severity: | regression | ||
Priority: | P2 | ||
Version: | unspecified | ||
Target Milestone: | --- | ||
Hardware: | All | ||
OS: | All | ||
Attachments: | xls file not readable with POI HSSF 3.0.2 (ok with 3.0.1) |
Description
Laurent Poublan
2008-02-07 08:46:06 UTC
Created attachment 21493
xls file not readable with POI HSSF 3.0.2 (ok with 3.0.1)
To reproduce, simply try:
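import java.io.FileInputStream;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

POIFSFileSystem fs = new POIFSFileSystem(new FileInputStream("C:/temp/test.xls"));
new HSSFWorkbook(fs); // this line throws a StringIndexOutOfBoundsException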
Hmm, no changes to org.apache.poi.hpsf.Property have been made since 2006, so it's not anything obvious there. I don't know if your document has a corrupt SummaryInformation stream, or if there's a bug in the SummaryInformation stream parsing. I've added a disabled failing testcase for it to svn trunk, which can be a start for someone to take a look at why the SummaryInformation isn't working. (3.0.1 didn't read document metadata by default, but 3.0.2 does.)

It seems that the method org.apache.poi.hpsf.Property.readDictionary(byte[], long, int, int) is not exercised by any of the existing JUnit tests. When comparing the execution flow of this bug with the successful test cases, divergence can be seen at line 151 of the constructor org.apache.poi.hpsf.Property.Property(long, byte[], long, int, int). For the sample spreadsheet, the Property constructor is invoked successfully 19 times before this.id == 0 and readDictionary() gets invoked.

The properties are broken. Neither the Windows XP Explorer nor Excel is able to show them. But at least they don't fail. I am going to implement the same behaviour in HPSF.

Fixed with revision 619765. HPSF now copes with a broken dictionary in Document Summary Information streams. RuntimeExceptions that occurred when trying to read bogus data are now caught. Dictionary entries up to but not including the bogus one are preserved; the rest are ignored.
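For anyone hitting something similar, the behaviour described in the fix (keep the dictionary entries read so far, drop everything from the first bogus one onwards) can be sketched roughly as below. This is not the code from revision 619765. The class name `TolerantDictionaryReader`, the `read` method and the simplified byte layout (a 32-bit little-endian entry count, then per entry a 32-bit ID, a 32-bit length and that many ISO-8859-1 bytes) are assumptions made only for illustration, and the real HPSF dictionary format differs in detail.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical, simplified illustration; not the actual HPSF implementation. */
final class TolerantDictionaryReader {

    /**
     * Reads a dictionary stored in an assumed layout: uint32 entry count,
     * then for each entry a uint32 ID, a uint32 length and `length` bytes
     * of ISO-8859-1 text. Parsing stops at the first entry that cannot be
     * read; entries before it are returned.
     */
    static Map<Long, String> read(byte[] src, int offset) {
        Map<Long, String> entries = new LinkedHashMap<>();
        try {
            ByteBuffer buf = ByteBuffer.wrap(src).order(ByteOrder.LITTLE_ENDIAN);
            buf.position(offset);
            long count = buf.getInt() & 0xFFFFFFFFL;
            for (long i = 0; i < count; i++) {
                long id = buf.getInt() & 0xFFFFFFFFL;
                int length = buf.getInt();
                // Reject lengths that cannot fit in the remaining data
                // before allocating anything.
                if (length < 0 || length > buf.remaining()) {
                    throw new IllegalStateException("bogus entry length: " + length);
                }
                byte[] raw = new byte[length];
                buf.get(raw);
                entries.put(id, new String(raw, StandardCharsets.ISO_8859_1));
            }
        } catch (RuntimeException e) {
            // Broken dictionary: keep what was read so far, ignore the rest.
        }
        return entries;
    }
}
```

With this sketch, a buffer that declares three entries but whose second entry has an impossible length yields a map containing only the first entry, instead of an exception propagating up to the caller that opens the workbook.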