Bug 43408

Summary: org.apache.poi.hssf.record.RecordFormatException at BOF record
Product: POI
Reporter: Tomas Vehovsky <easy>
Component: HSSF
Assignee: POI Developers List <dev>
Status: RESOLVED DUPLICATE    
Severity: normal    
Priority: P2    
Version: 3.0-FINAL   
Target Milestone: ---   
Hardware: Other   
OS: Windows 2000   

Description Tomas Vehovsky 2007-09-17 08:18:51 UTC
My Java code
  ...
  fs = new POIFSFileSystem( new FileInputStream( sPath ) );
  m_workbook = new HSSFWorkbook(fs);

throws an exception while creating the new HSSFWorkbook instance:

org.apache.poi.hssf.record.RecordFormatException: Unable to construct record instance
	at org.apache.poi.hssf.record.RecordFactory.createRecord(RecordFactory.java:191)
	at org.apache.poi.hssf.record.RecordFactory.createRecords(RecordFactory.java:115)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:205)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:153)
	at ...
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
	at java.lang.reflect.Constructor.newInstance(Unknown Source)
	at org.apache.poi.hssf.record.RecordFactory.createRecord(RecordFactory.java:179)
	... 12 more
Caused by: java.lang.ArrayIndexOutOfBoundsException
	at org.apache.poi.hssf.record.RecordInputStream.checkRecordPosition(RecordInputStream.java:132)
	at org.apache.poi.hssf.record.RecordInputStream.readInt(RecordInputStream.java:155)
	at org.apache.poi.hssf.record.BOFRecord.fillFields(BOFRecord.java:118)
	at org.apache.poi.hssf.record.Record.<init>(Record.java:56)
	at org.apache.poi.hssf.record.BOFRecord.<init>(BOFRecord.java:99)
	... 17 more

The exception is raised in org.apache.poi.hssf.record.BOFRecord.fillFields() at
these statements:
...
field_5_history  = in.readInt();
field_6_rversion = in.readInt();

I cannot attach an example spreadsheet that produces this error because it
contains my customer's private data. However, the XLS format appears to be quite
old, because the BOFRecord member "field_4_year" is "1995".

My guess is that the BOF record in this XLS file does not contain the "history"
and "rversion" fields.

Changing the Java code to
try {
  field_5_history  = in.readInt();
  field_6_rversion = in.readInt();
}
catch (ArrayIndexOutOfBoundsException aioobe ) {
  // short BOF record: the trailing fields are absent, so keep their defaults
}

got rid of the problem, and my XLS file opened correctly afterwards.
Please consider adding this fix to the official POI package.
Comment 1 Nick Burch 2007-09-17 09:50:32 UTC
Do you happen to know which version of Excel produced the file (or whether it
was produced by Excel at all)?

Before adding in patches to skip records, it'd be good to know how common it's
likely to be. Also, we normally like to add unit tests, so ideally we'll want an
example of a problem file. Any chance you could edit out the sensitive data, and
upload?
Comment 2 Tomas Vehovsky 2007-09-20 08:00:10 UTC
The file was produced by an external tool (not by Excel itself).
Unfortunately, I cannot cut out the sensitive data: once I open and save the
document in MS Excel, the BOF record inside the document is changed (fixed)
and the document can be opened by POI.
The BOF record of the problematic document looks like this:
Record identifier: 0809h ==> BIFF5-8
Offset (Length) Value: 0 (2) 0600h ==> BIFF8
Offset (Length) Value: 2 (2) 0005h ==> Workbook globals
Offset (Length) Value: 4 (2) 09D7h
Offset (Length) Value: 6 (2) 07CBh
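
For reference, the four 16-bit values listed above can be decoded from their
little-endian on-disk bytes. This is a minimal sketch, not POI code: the class
and method names are hypothetical, and the byte array is simply the dump above
written out in file order.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class BofDump {
    /** Decodes the first four unsigned 16-bit little-endian fields of a BOF record body. */
    static int[] decode(byte[] body) {
        ByteBuffer buf = ByteBuffer.wrap(body).order(ByteOrder.LITTLE_ENDIAN);
        int[] fields = new int[4];
        for (int i = 0; i < fields.length; i++) {
            fields[i] = buf.getShort() & 0xFFFF; // mask to read the short as unsigned
        }
        return fields;
    }

    public static void main(String[] args) {
        // Body of the 8-byte BOF record from the dump: 0600h, 0005h, 09D7h, 07CBh.
        byte[] body = {0x00, 0x06, 0x05, 0x00, (byte) 0xD7, 0x09, (byte) 0xCB, 0x07};
        int[] f = decode(body);
        // prints: version=0x0600 type=0x0005 build=2519 year=1995
        System.out.printf("version=0x%04X type=0x%04X build=%d year=%d%n",
                f[0], f[1], f[2], f[3]);
    }
}
```

The last value, 07CBh, is 1995 in decimal, which matches the "field_4_year"
value reported in the description. The record body ends after these 8 bytes.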

This is clearly not a record written by Excel but one written by an external
tool. See, for example, the Excel file format description
http://sc.openoffice.org/excelfileformat.pdf
(chapters 5.8.1 and 5.8.2). If POI is supposed to accept XLS files written by
external tools as well as those written by Excel, then it should treat the
trailing fields of the BOF record as optional and accept a record in which they
are missing.
A possible solution for non-standard BOF records is described here:
http://mail-archives.apache.org/mod_mbox/poi-dev/200706.mbox/%3C466DA1B3.5050006@admadic.de%3E

...
protected void fillFields(RecordInputStream in)
    {
        field_1_version  = in.readShort();
        field_2_type     = in.readShort();
        if (in.getRecordOffset()<in.getLength())
           field_3_build    = in.readShort();
        if (in.getRecordOffset()<in.getLength())
           field_4_year     = in.readShort();
        if (in.getRecordOffset()<in.getLength())
           field_5_history  = in.readInt();
        if (in.getRecordOffset()<in.getLength())
           field_6_rversion = in.readInt();
    }
...
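
The same idea can be tried outside POI. The following is a minimal sketch under
stated assumptions: it uses java.nio.ByteBuffer in place of POI's
RecordInputStream, the class and field names are hypothetical, and absent
fields simply keep the default value 0. Unlike the snippet above, it checks
that enough bytes remain for each field (2 or 4) rather than only that the end
of the record has not been reached.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class TolerantBof {
    int version, type, build, year, history, requiredVersion;

    /** Parses a BOF record body, treating every field after "type" as optional. */
    static TolerantBof parse(byte[] body) {
        ByteBuffer in = ByteBuffer.wrap(body).order(ByteOrder.LITTLE_ENDIAN);
        TolerantBof bof = new TolerantBof();
        // The first two fields are treated as mandatory here;
        // a body shorter than 4 bytes throws BufferUnderflowException.
        bof.version = in.getShort() & 0xFFFF;
        bof.type = in.getShort() & 0xFFFF;
        // Read each later field only if the record still holds enough bytes for it.
        if (in.remaining() >= 2) bof.build = in.getShort() & 0xFFFF;
        if (in.remaining() >= 2) bof.year = in.getShort() & 0xFFFF;
        if (in.remaining() >= 4) bof.history = in.getInt();
        if (in.remaining() >= 4) bof.requiredVersion = in.getInt();
        return bof;
    }

    public static void main(String[] args) {
        // The 8-byte body from the problematic file: no history/rversion fields.
        byte[] shortBody = {0x00, 0x06, 0x05, 0x00, (byte) 0xD7, 0x09, (byte) 0xCB, 0x07};
        TolerantBof bof = parse(shortBody);
        System.out.println("year=" + bof.year + " history=" + bof.history);
    }
}
```

With the 8-byte body the two trailing integers are skipped without an
exception, which is the behaviour the try/catch workaround was approximating.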

Comment 3 Nick Burch 2007-11-12 13:53:10 UTC

*** This bug has been marked as a duplicate of 42794 ***