Bug 35527

Summary:	ArrayIndexOutOfBoundsException when reading xls file
Product:	POI
Reporter:	Olivier <OMaffeis>
Component:	HSSF
Assignee:	POI Developers List <dev>
Status:	RESOLVED FIXED
Severity:	major
CC:	avik
Priority:	P2
Version:	2.5-FINAL
Target Milestone:	---
Hardware:	PC
OS:	Windows XP
Attachments:	"Reduced" file that still generates the error.

Description Olivier 2005-06-28 22:32:37 UTC

I am reading a file with :

	POIFSFileSystem fs = new POIFSFileSystem(is);
	workbook = new HSSFWorkbook(fs);

and it generates an InvocationTargetException caused by the exception
"java.lang.ArrayIndexOutOfBoundsException: 33".
I tried to locate the problem in my Excel file by progressively deleting its
content, down to a blank file that still triggers the problem.
I looked for existing patches without success.
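
For anyone hitting the same wrapped exception: the sketch below shows one way to get at the underlying cause of an InvocationTargetException. It uses only the JDK, not POI. The class UnwrapCauseSketch and the method failingRead are hypothetical stand-ins for whatever code ends up calling the workbook constructor reflectively.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

final class UnwrapCauseSketch {
    // Hypothetical stand-in for code that fails while parsing a workbook.
    public static void failingRead() {
        int[] buffer = new int[33];
        buffer[33] = 0; // index 33 is out of bounds for an array of length 33
    }

    // Invokes a static no-arg method reflectively and returns the exception
    // it threw, or null if the call completed normally.
    static Throwable invokeAndUnwrap(Class<?> type, String methodName) {
        try {
            Method m = type.getDeclaredMethod(methodName);
            m.invoke(null);
            return null;
        } catch (InvocationTargetException e) {
            return e.getCause(); // the exception raised inside the invoked method
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e); // lookup or access problem, not a parse failure
        }
    }

    public static void main(String[] args) {
        Throwable cause = invokeAndUnwrap(UnwrapCauseSketch.class, "failingRead");
        System.out.println("underlying cause type: " + cause.getClass().getName());
    }
}
```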

Comment 1 Olivier 2005-06-28 22:34:41 UTC

Created attachment 15546
"Reduced" file that still generates the error.

Comment 2 Amol Deshmukh 2005-06-28 23:23:48 UTC

I confirmed the bug with the latest code from CVS. Here is the relevant part of
the stack trace for reference:

<snipped/>
java.lang.ArrayIndexOutOfBoundsException: 33
	at org.apache.poi.util.LittleEndian.getNumber(LittleEndian.java:491)
	at org.apache.poi.util.LittleEndian.getShort(LittleEndian.java:52)
	at org.apache.poi.hssf.record.ObjRecord.fillFields(ObjRecord.java:99)
	at org.apache.poi.hssf.record.Record.fillFields(Record.java:90)
	at org.apache.poi.hssf.record.Record.<init>(Record.java:55)
	at org.apache.poi.hssf.record.ObjRecord.<init>(ObjRecord.java:61)
<snipped/>

Apparently, org.apache.poi.hssf.record.ObjRecord.fillFields runs out of data in
the loop:

        while (pos - offset < size)
        {
            short subRecordSid = LittleEndian.getShort(data, pos);
            short subRecordSize = LittleEndian.getShort(data, pos + 2);
            Record subRecord = SubRecord.createSubRecord(subRecordSid,
                    subRecordSize, data, pos + 4);
            subrecords.add(subRecord);
            pos += subRecord.getRecordSize();
        }

The following fix seemed to work, although the underlying cause is still not known:

file: org/apache/poi/hssf/record/ObjRecord.java

  // size-4: since at least the first two shorts have to be read
  while (pos - offset < size-4) 

I was able to read and save the file as a new xls with the above change and the
new file opened up fine.
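
To illustrate the boundary condition, here is a minimal sketch that does not use POI. The getShort method is a hypothetical stand-in for LittleEndian.getShort, and the two-byte buffer is made up for the example. It compares the original loop condition with the proposed one when only a sub-record id is left in the data.

```java
final class LoopBoundSketch {
    // Hypothetical stand-in for POI's LittleEndian.getShort: 2 bytes, little-endian.
    static short getShort(byte[] data, int pos) {
        return (short) ((data[pos] & 0xFF) | ((data[pos + 1] & 0xFF) << 8));
    }

    // Mirrors the original condition: true as long as at least one byte is left.
    static boolean originalCondition(int pos, int offset, int size) {
        return pos - offset < size;
    }

    // Mirrors the proposed condition: true only when more than 4 bytes are left.
    static boolean proposedCondition(int pos, int offset, int size) {
        return pos - offset < size - 4;
    }

    public static void main(String[] args) {
        // Made-up data: a 2-byte sub-record id with no size field after it.
        byte[] data = {0x00, 0x00};
        int offset = 0;
        int size = data.length;
        int pos = 0;

        System.out.println("original enters loop: " + originalCondition(pos, offset, size));
        System.out.println("proposed enters loop: " + proposedCondition(pos, offset, size));
        try {
            getShort(data, pos);     // reading the id succeeds
            getShort(data, pos + 2); // reading the size runs past the end of the array
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("size read failed: " + e.getMessage());
        }
    }
}
```

Note that with exactly 4 bytes left, the proposed condition is also false, so a final sub-record consisting of just an id and a size would be skipped rather than read.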

Comment 3 Avik Sengupta 2005-07-05 14:33:58 UTC

Amol, 

Can you please check whether your fix also solves bug 34575? I think they are
the same.

One thing I can't figure out: does your patch optionally read in the bytes if
they exist? I ask because I don't think the bug is prevalent in all files with
an OBJ record. If that is the case, I think this can go in (with some comments
in the code :)

Comment 4 Amol Deshmukh 2005-07-08 16:59:12 UTC

*** I thought I had resolved this issue as fixed   ***
*** a couple of days back, but my fix comments     ***
*** don't appear, so I'm adding them again         ***


Further "investigation" revealed that the byte array
was falling 2 bytes short when the sid of the SubRecord
indicated an EndSubRecord.

Hence, I made a slight modification to the change I proposed
earlier, in the file I committed. Here is the changed part:

<code>

while (pos - offset <= size-2) // at least one "short" must be present
{
  short subRecordSid = LittleEndian.getShort(data, pos);
  short subRecordSize = -1; // set default to "< 0"
  if (pos-offset <= size-4) { // see if size info is present, else default to -1
    subRecordSize = LittleEndian.getShort(data, pos + 2);
  }

</code>



Now, when the byte array falls two bytes short while dealing with an
EndSubRecord, the length is implicitly set to 0: the change in
ObjRecord sets the size to a default value of -1 whenever the
short holding the SubRecord size is not found.
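
For reference, the revised loop can be tried outside POI with a sketch like the one below. It is not the committed code. The getShort and scan methods are hypothetical, the sid values are only illustrative, and it assumes that a sub-record with no size field has an empty payload and occupies just its 2 sid bytes.

```java
import java.util.ArrayList;
import java.util.List;

final class TruncatedSubRecordSketch {
    // Hypothetical stand-in for POI's LittleEndian.getShort: 2 bytes, little-endian.
    static short getShort(byte[] data, int pos) {
        return (short) ((data[pos] & 0xFF) | ((data[pos + 1] & 0xFF) << 8));
    }

    // Returns {sid, size} pairs; size is -1 when the size field is missing.
    static List<int[]> scan(byte[] data, int offset, int size) {
        List<int[]> found = new ArrayList<>();
        int pos = offset;
        while (pos - offset <= size - 2) { // at least one short (the sid) must be present
            short sid = getShort(data, pos);
            short subSize = -1;            // default when no size field is available
            if (pos - offset <= size - 4) {
                subSize = getShort(data, pos + 2);
            }
            found.add(new int[] {sid, subSize});
            // Assumption: no size field means an empty payload and only the
            // 2 sid bytes consumed; otherwise 4 header bytes plus the payload.
            pos += (subSize < 0) ? 2 : 4 + subSize;
        }
        return found;
    }

    public static void main(String[] args) {
        // One complete sub-record (sid 0x15, 2 payload bytes), then a bare sid 0x00.
        byte[] data = {0x15, 0x00, 0x02, 0x00, (byte) 0xAA, (byte) 0xBB, 0x00, 0x00};
        for (int[] r : scan(data, 0, data.length)) {
            System.out.println("sid=" + r[0] + " size=" + r[1]);
        }
        // prints "sid=21 size=2" then "sid=0 size=-1"
    }
}
```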

Comment 5 Amol Deshmukh 2005-07-08 17:02:30 UTC

*** Bug 34575 has been marked as a duplicate of this bug. ***