Bug 48261

Summary: Can't open an Excel 97 file with POI
Product: POI
Reporter: pgervaise
Component: HSSF
Assignee: POI Developers List <dev>
Status: RESOLVED WONTFIX    
Severity: critical    
Priority: P2    
Version: 3.5-FINAL   
Target Milestone: ---   
Hardware: PC   
OS: Windows XP   
Attachments: The excel file that cannot be opened with POI

Description pgervaise 2009-11-23 03:40:24 UTC
Created attachment 24586 [details]
The excel file that cannot be opened with POI

With this code:

FileInputStream in = new FileInputStream("not_work.xls");
HSSFWorkbook w = new HSSFWorkbook(in);

POI throws an exception:

Warning, incorrectly terminated empty data blocks in POIFS block listing (should end at -2, ended at 0)
Warning, incorrectly terminated empty data blocks in POIFS block listing (should end at -2, ended at 0)
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 0
	at org.apache.poi.poifs.storage.DocumentBlock.getDataInputBlock(DocumentBlock.java:162)
	at org.apache.poi.poifs.filesystem.POIFSDocument.getDataInputBlock(POIFSDocument.java:253)
	at org.apache.poi.poifs.filesystem.DocumentInputStream.getDataInputBlock(DocumentInputStream.java:117)
	at org.apache.poi.poifs.filesystem.DocumentInputStream.<init>(DocumentInputStream.java:75)
	at org.apache.poi.poifs.filesystem.DirectoryNode.createDocumentInputStream(DirectoryNode.java:131)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:273)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:200)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:316)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:297)
	at Main.main(Main.java:10)


Excel 97 (and Excel 2007) can open the file, with all its data.

Note: I also tested with POI 2.0, POI 3.0-RC4, POI 3.1 and POI 3.2, and the result is:

Exception in thread "main" java.io.IOException: block[ 0 ] already removed
	at org.apache.poi.poifs.storage.BlockListImpl.remove(BlockListImpl.java:97)
	at org.apache.poi.poifs.storage.BlockAllocationTableReader.fetchBlocks(BlockAllocationTableReader.java:190)
	at org.apache.poi.poifs.storage.BlockListImpl.fetchBlocks(BlockListImpl.java:130)
	at org.apache.poi.poifs.storage.SmallBlockTableReader.getSmallDocumentBlocks(SmallBlockTableReader.java:61)
	at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:176)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:312)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:293)
	at Main.main(Main.java:12)
Comment 1 pgervaise 2009-11-23 03:44:48 UTC
I don't know how the file was generated. It came from a partner by email.

This bug seems similar to bug 32076.
Comment 2 Baris Ulucinar 2009-12-07 00:22:27 UTC
I have the same error with the same version of Apache POI (3.5-FINAL). Is a patch available?

My error log:

java.lang.ArrayIndexOutOfBoundsException: 0
	at org.apache.poi.poifs.storage.DocumentBlock.getDataInputBlock(DocumentBlock.java:163)
	at org.apache.poi.poifs.filesystem.POIFSDocument.getDataInputBlock(POIFSDocument.java:253)
	at org.apache.poi.poifs.filesystem.DocumentInputStream.getDataInputBlock(DocumentInputStream.java:117)
	at org.apache.poi.poifs.filesystem.DocumentInputStream.<init>(DocumentInputStream.java:75)
	at org.apache.poi.poifs.filesystem.DirectoryNode.createDocumentInputStream(DirectoryNode.java:131)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:273)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:200)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:316)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:297)
	at org.apache.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:60)
Comment 4 Nick Burch 2010-06-03 12:50:47 UTC
As best I can tell, the file is at least partly corrupted. The POIFS block listing is wrong for starters (hence the warnings).

I strongly suspect that the file was generated by something other than Excel, and that the program in question doesn't properly follow the spec.

I'd suggest you either re-save the files using Excel to fix them, or speak to whoever writes the software that generates them and ask them to re-read the spec and fix their output.
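
For anyone triaging a similar file, here is a minimal sketch of a first sanity check, using only the JDK and not POI. It tests whether a file starts with the 8-byte OLE2 signature that .xls containers begin with. The class and method names are hypothetical. It only catches files that are not OLE2 at all: a file with a valid header but a bad block listing would still pass, so treat it as a rough filter rather than a validator.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical helper, not part of POI.
class Ole2SignatureCheck {

    // First 8 bytes of an OLE2 compound document (the container used by .xls).
    private static final byte[] OLE2_SIGNATURE = {
        (byte) 0xD0, (byte) 0xCF, (byte) 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, (byte) 0x1A, (byte) 0xE1
    };

    // Returns true if the stream starts with the OLE2 signature.
    // A stream shorter than 8 bytes is reported as false.
    static boolean hasOle2Signature(InputStream in) throws IOException {
        byte[] head = new byte[OLE2_SIGNATURE.length];
        try {
            new DataInputStream(in).readFully(head);
        } catch (EOFException e) {
            return false;
        }
        return Arrays.equals(head, OLE2_SIGNATURE);
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: java Ole2SignatureCheck <file.xls>");
            return;
        }
        InputStream in = new FileInputStream(args[0]);
        try {
            System.out.println(hasOle2Signature(in)
                ? "OLE2 signature present"
                : "not an OLE2 file");
        } finally {
            in.close();
        }
    }
}
```

Run it as `java Ole2SignatureCheck not_work.xls`. If the signature is missing, the file is not a binary Excel file at all and re-saving from Excel is the only realistic fix.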
Comment 5 Nick Burch 2011-05-23 19:43:18 UTC
In case anyone comes across this: with 3.8, the new NPOIFS is more tolerant of this class of faulty file and can open the attached sample Excel file. POIFS can also open more files in this class than it could in 3.6, but the real fix does appear to lie with the program that produces the file.