I am reading a file with:

    POIFSFileSystem fs = new POIFSFileSystem(is);
    workbook = new HSSFWorkbook(fs);

and it throws an InvocationTargetException caused by "java.lang.ArrayIndexOutOfBoundsException: 33". I tried to locate the problem in my Excel file by progressively deleting its content, down to a blank file that still triggers the error. I looked for existing patches without success.
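For readers unsure how the two exceptions relate: an InvocationTargetException is only the wrapper that reflection puts around whatever the invoked method threw, and getCause() returns the real failure. The sketch below is a minimal illustration using only the JDK. The class UnwrapCauseSketch and the method readPastEnd are hypothetical stand-ins and are not part of POI or of the reporter's code.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

class UnwrapCauseSketch {
    // Hypothetical stand-in for the parsing code: indexes past a 33-byte array.
    public static int readPastEnd() {
        byte[] data = new byte[33];
        return data[33];
    }

    // Calls readPastEnd reflectively and returns the underlying cause,
    // or null if the call completed normally.
    public static Throwable invokeAndUnwrap() throws Exception {
        Method m = UnwrapCauseSketch.class.getDeclaredMethod("readPastEnd");
        try {
            m.invoke(null);
            return null;
        } catch (InvocationTargetException e) {
            return e.getCause(); // the exception thrown inside the invoked method
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(invokeAndUnwrap());
    }
}
```

Logging the cause rather than the wrapper is what makes the ArrayIndexOutOfBoundsException and its stack trace visible.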
Created attachment 15546: "Reduced" file that still generates the error.
I confirmed the bug with the latest code from CVS. Here is the relevant part of the stack trace for reference:

    <snipped/>
    java.lang.ArrayIndexOutOfBoundsException: 33
        at org.apache.poi.util.LittleEndian.getNumber(LittleEndian.java:491)
        at org.apache.poi.util.LittleEndian.getShort(LittleEndian.java:52)
        at org.apache.poi.hssf.record.ObjRecord.fillFields(ObjRecord.java:99)
        at org.apache.poi.hssf.record.Record.fillFields(Record.java:90)
        at org.apache.poi.hssf.record.Record.<init>(Record.java:55)
        at org.apache.poi.hssf.record.ObjRecord.<init>(ObjRecord.java:61)
    <snipped/>

Apparently, org.apache.poi.hssf.record.ObjRecord.fillFields runs out of data in this loop:

    while (pos - offset < size)
    {
        short subRecordSid = LittleEndian.getShort(data, pos);
        short subRecordSize = LittleEndian.getShort(data, pos + 2);
        Record subRecord = SubRecord.createSubRecord(subRecordSid, subRecordSize, data, pos + 4);
        subrecords.add(subRecord);
        pos += subRecord.getRecordSize();
    }

The following fix seemed to work, although the underlying cause is still not known:

    file: org/apache/poi/hssf/record/ObjRecord.java

    // size-4: since at least the first two shorts have to be read
    while (pos - offset < size-4)

With the above change I was able to read the file and save it as a new xls, and the new file opened fine.
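To make the failure mode concrete, here is a minimal, self-contained sketch of the loop that does not depend on POI. The class ObjLoopSketch, its getShort helper, and the sample bytes are made up for illustration, and the step `4 + subRecordSize` is an assumption standing in for subRecord.getRecordSize(). It only shows that the original bound lets the loop start an iteration with fewer than four bytes left, while the `size - 4` bound does not.

```java
import java.util.ArrayList;
import java.util.List;

class ObjLoopSketch {
    // Made-up OBJ body: one complete sub-record (sid 0x15, size 2, two payload
    // bytes) followed by a trailing sid of 0x00 whose size field is missing.
    static final byte[] TRUNCATED = {0x15, 0x00, 0x02, 0x00, 0x11, 0x22, 0x00, 0x00};

    // Little-endian 16-bit read, standing in for LittleEndian.getShort(data, pos).
    static short getShort(byte[] data, int pos) {
        return (short) ((data[pos] & 0xFF) | ((data[pos + 1] & 0xFF) << 8));
    }

    // Walks the sub-records and returns their sids. 'limit' is the loop bound:
    // pass size for the original condition, size - 4 for the proposed one.
    static List<Integer> scan(byte[] data, int offset, int limit) {
        List<Integer> sids = new ArrayList<>();
        int pos = offset;
        while (pos - offset < limit) {
            int sid = getShort(data, pos);
            int subRecordSize = getShort(data, pos + 2); // throws if under 4 bytes remain
            sids.add(sid);
            pos += 4 + subRecordSize; // assumption: 4-byte header plus payload
        }
        return sids;
    }

    public static void main(String[] args) {
        int size = TRUNCATED.length;
        System.out.println("proposed bound: " + scan(TRUNCATED, 0, size - 4));
        try {
            scan(TRUNCATED, 0, size); // original bound
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("original bound failed: " + e);
        }
    }
}
```

In this sketch the strict `<` against `size - 4` also stops before a final sub-record that consists of exactly a four-byte header, so the trailing sub-record is skipped rather than parsed.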
Amol, can you please check whether your change also solves bug 34575? I think they are the same. What I can't figure out is whether your patch reads in the bytes only if they exist, because I don't think the bug is present in all files with an OBJ record. If that is the case, I think this can go in (with some comments in the code :)
*** I thought I had resolved this issue as fixed ***
*** a couple of days back, but my fix comments don't ***
*** appear, so I'm adding these comments again ***

Further investigation revealed that the byte array falls 2 bytes short when the sid of the SubRecord indicates an EndSubRecord. Hence, I slightly modified the change I proposed earlier in the file I committed. Here is the changed part:

<code>
while (pos - offset <= size-2) // at least one "short" must be present
{
    short subRecordSid = LittleEndian.getShort(data, pos);
    short subRecordSize = -1; // set default to "< 0"
    if (pos-offset <= size-4) { // see if size info is present, else default to -1
        subRecordSize = LittleEndian.getShort(data, pos + 2);
    }
</code>

Now, when the byte array is two bytes short for an EndSubRecord, ObjRecord leaves the sub-record size at its default of -1 because the size short is not present, and the length is then implicitly treated as 0.
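The revised bounds can be exercised without POI in a small standalone sketch. The class TolerantObjScan, its getShort helper, and the sample bytes are hypothetical, and treating a negative size as an empty payload via Math.max is an assumption about how the -1 default is handled downstream, not a copy of the committed code.

```java
import java.util.ArrayList;
import java.util.List;

class TolerantObjScan {
    // Made-up OBJ bodies: the first ends with a bare sid (size field missing),
    // the second ends with a complete four-byte end sub-record.
    static final byte[] TRUNCATED = {0x15, 0x00, 0x02, 0x00, 0x11, 0x22, 0x00, 0x00};
    static final byte[] COMPLETE = {0x15, 0x00, 0x02, 0x00, 0x11, 0x22, 0x00, 0x00, 0x00, 0x00};

    // Little-endian 16-bit read, standing in for LittleEndian.getShort(data, pos).
    static short getShort(byte[] data, int pos) {
        return (short) ((data[pos] & 0xFF) | ((data[pos + 1] & 0xFF) << 8));
    }

    // Returns the sids found, reading the size field only when it is present.
    static List<Integer> scan(byte[] data, int offset, int size) {
        List<Integer> sids = new ArrayList<>();
        int pos = offset;
        while (pos - offset <= size - 2) { // at least the sid must be present
            int sid = getShort(data, pos);
            int subRecordSize = -1; // default when the size field is missing
            if (pos - offset <= size - 4) {
                subRecordSize = getShort(data, pos + 2);
            }
            sids.add(sid);
            // Assumption: a negative size means "no payload".
            pos += 4 + Math.max(subRecordSize, 0);
        }
        return sids;
    }

    public static void main(String[] args) {
        System.out.println(scan(TRUNCATED, 0, TRUNCATED.length));
        System.out.println(scan(COMPLETE, 0, COMPLETE.length));
    }
}
```

Under these assumptions both inputs yield the same two sids, so the truncated end sub-record is recognised instead of being skipped or causing an out-of-bounds read.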
*** Bug 34575 has been marked as a duplicate of this bug. ***