Bug 46987

Summary: java.lang.RuntimeException: Buffer underrun - requested 512 bytes but X was available
Product: POI
Reporter: Trejkaz (pen name) <trejkaz>
Component: HSSF
Assignee: POI Developers List <dev>
Status: RESOLVED FIXED    
Severity: regression    
Priority: P2    
Version: 3.5-dev   
Target Milestone: ---   
Hardware: PC   
OS: Windows Vista   
Attachments: test file

Description Trejkaz (pen name) 2009-04-07 17:14:54 UTC
I get the following exception loading a workbook:

java.lang.RuntimeException: Buffer underrun - requested 512 bytes but 192 was available
	at org.apache.poi.poifs.filesystem.DocumentInputStream.checkAvaliable(DocumentInputStream.java:202)
	at org.apache.poi.poifs.filesystem.DocumentInputStream.readFully(DocumentInputStream.java:224)
	at org.apache.poi.hssf.record.RecordInputStream.readFully(RecordInputStream.java:251)
	at org.apache.poi.hssf.record.RecordInputStream.readFully(RecordInputStream.java:246)
	at org.apache.poi.hssf.record.RecordInputStream.readRemainder(RecordInputStream.java:341)
	at org.apache.poi.hssf.record.UnknownRecord.<init>(UnknownRecord.java:73)
	at org.apache.poi.hssf.record.RecordFactory.createSingleRecord(RecordFactory.java:251)
	at org.apache.poi.hssf.record.RecordFactory.createRecords(RecordFactory.java:373)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:277)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:202)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:184)

While it is true that document streams should be multiples of 512 bytes, other parts of POI work around violations of this rule in order to tolerate files written by flakier past versions of Excel.

This was not a problem in POI 3.1.

Comment 1 Josh Micich 2009-04-08 09:03:17 UTC
DocumentInputStream can contain any length of data.  If you put a breakpoint at DocumentInputStream.<init>(DocumentEntry.java:73) and run the unit tests, you will see that very few of the POI test case examples have an exact multiple of 512 bytes.

The stack trace indicates that, 196 bytes before the end of the stream, an unknown record sid was read, followed by an apparent size of 512.  Excel document streams should end with EOFRecord.  One possibility is that the originating application has written EOFRecord correctly, but padded the stream with (non-zero) garbage.  Alternatively, there could be a misalignment problem a little before this location.
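
To illustrate the diagnosis above, here is a minimal sketch that walks a raw record stream and reports where a record header claims more data than the stream has left. It assumes the usual BIFF layout of a 2-byte little-endian sid followed by a 2-byte little-endian data length. The class RecordHeaderScan and the method findUnderrun are hypothetical names made up for this example and are not part of POI.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper for illustration only; not part of POI.
public class RecordHeaderScan {
    /** Size of a record header: 2-byte sid + 2-byte data length. */
    static final int HEADER_SIZE = 4;

    /**
     * Returns the offset of the first record header whose declared data
     * length exceeds the bytes remaining after it, or -1 if every
     * record fits in the stream.
     */
    static int findUnderrun(byte[] stream) {
        ByteBuffer buf = ByteBuffer.wrap(stream).order(ByteOrder.LITTLE_ENDIAN);
        while (buf.remaining() >= HEADER_SIZE) {
            int headerOffset = buf.position();
            buf.getShort();                      // sid, not needed for this check
            int size = buf.getShort() & 0xFFFF;  // declared data length, unsigned
            if (size > buf.remaining()) {
                return headerOffset;             // record claims more than is left
            }
            buf.position(buf.position() + size); // skip the record data
        }
        return -1;
    }
}
```

With a stream consisting of an EOF record followed by a 4-byte header that declares 512 bytes and only 192 bytes of data, the reported offset is 196 bytes before the end of the stream: the 4 header bytes plus the 192 bytes that were available.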

Could you please attach the offending spreadsheet?  This will help diagnose the problem.

Comment 2 Trejkaz (pen name) 2009-04-13 22:18:41 UTC
I can't give you the document because it contains company information, but I put a breakpoint in RecordFactory to confirm that it is doing this:

record = [SELECTION]
    .pane            = 0x03
    .activecellrow   = 0x0000
    .activecellcol   = 0x0000
    .activecellref   = 0x0000
    .numrefs         = 0x0001
[/SELECTION]

record = [EOF]
[/EOF]


java.lang.RuntimeException: Buffer underrun - requested 512 bytes but 192 was available
	at org.apache.poi.poifs.filesystem.DocumentInputStream.checkAvaliable(DocumentInputStream.java:202)
	at org.apache.poi.poifs.filesystem.DocumentInputStream.readFully(DocumentInputStream.java:224)
	at org.apache.poi.hssf.record.RecordInputStream.readFully(RecordInputStream.java:251)
	at org.apache.poi.hssf.record.RecordInputStream.readFully(RecordInputStream.java:246)
	at org.apache.poi.hssf.record.RecordInputStream.readRemainder(RecordInputStream.java:341)
	at org.apache.poi.hssf.record.UnknownRecord.<init>(UnknownRecord.java:73)
	at org.apache.poi.hssf.record.RecordFactory.createSingleRecord(RecordFactory.java:251)
	at org.apache.poi.hssf.record.RecordFactory.createRecords(RecordFactory.java:373)


So the old version was (correctly?) stopping at EOF, and the new version for whatever reason keeps reading despite hitting the EOF marker.
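
A rough sketch of the "stop at EOF" behaviour described here, under some assumptions: BOF (sid 0x0809) and EOF (sid 0x000A) are tracked as a nesting depth, because a workbook stream contains one BOF/EOF pair per substream, and anything that follows a top-level EOF without starting a new BOF is treated as padding. The class TolerantRecordScan and the method readSids are hypothetical names for this example; this is not the code of either POI version.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch for illustration only; not POI's implementation.
public class TolerantRecordScan {
    static final int BOF_SID = 0x0809;
    static final int EOF_SID = 0x000A;
    static final int HEADER_SIZE = 4; // 2-byte sid + 2-byte data length

    /**
     * Collects the sids of all records in the stream. Bytes that follow a
     * top-level EOF and do not begin a new BOF are ignored as padding.
     */
    static List<Integer> readSids(byte[] stream) {
        ByteBuffer buf = ByteBuffer.wrap(stream).order(ByteOrder.LITTLE_ENDIAN);
        List<Integer> sids = new ArrayList<>();
        int bofDepth = 0;
        boolean afterTopLevelEof = false;
        while (buf.remaining() >= HEADER_SIZE) {
            int sid = buf.getShort() & 0xFFFF;
            int size = buf.getShort() & 0xFFFF;
            if (afterTopLevelEof && sid != BOF_SID) {
                break; // trailing garbage after the last EOF: stop quietly
            }
            if (size > buf.remaining()) {
                // An underrun anywhere else is still treated as corruption.
                throw new IllegalStateException("Buffer underrun - requested "
                        + size + " bytes but " + buf.remaining() + " was available");
            }
            buf.position(buf.position() + size); // skip the record data
            sids.add(sid);
            if (sid == BOF_SID) {
                bofDepth++;
                afterTopLevelEof = false;
            } else if (sid == EOF_SID) {
                bofDepth--;
                afterTopLevelEof = bofDepth < 1;
            }
        }
        return sids;
    }
}
```

Only padding after a top-level EOF is tolerated in this sketch. A record that overruns the stream in the middle of a substream still fails, so genuine misalignment is not silently hidden.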

Comment 3 Trejkaz (pen name) 2009-04-13 23:20:11 UTC
Created attachment 23483 [details]
test file

OK, I have spent some time manually x'ing over the names and emails throughout the file.  Getting rid of the phone numbers was particularly hard (RK records... yuck.)

This should allow you to reproduce the problem locally.

Comment 4 Josh Micich 2009-04-17 00:01:30 UTC
Thanks for supplying the sample file.  Excel opens it without complaint and corrects the problem when re-saving.  POI should do the same.

Fixed in svn r765866

JUnit test added.