Bug 60891 - LittleEndianInputStream's readFully() doesn't
Summary: LittleEndianInputStream's readFully() doesn't
Alias: None
Product: POI
Classification: Unclassified
Component: POIFS (show other bugs)
Version: 3.16-dev
Hardware: PC All
: P2 normal (vote)
Target Milestone: ---
Assignee: POI Developers List
Depends on:
Reported: 2017-03-20 20:17 UTC by Tim Allison
Modified: 2017-03-20 20:49 UTC (History)
0 users


Note You need to log in before you can comment on or make changes to this bug.
Description Tim Allison 2017-03-20 20:17:38 UTC
While trying to write an example of TestSecureTempZip's protectedTempZip for the new XLSB parser, I got an exception which I tracked down to LittleEndianInputStream's readFully, doesn't actually read fully; rather, it makes a single call to read and then checks if that hit EOF.  

When I fixed it (i.e. copied/pasted from commons-io IOUtils readFully) to read until the length was hit or an EOF was hit, the example worked with an encrypted xlsb.  All unit (including stress) tests pass.

Any objections to fixing this?  

Or, does this subtly change the expectations of readFully, and I should leave this as is?

This seems like such a core component, we should have found this earlier?!
Comment 1 Tim Allison 2017-03-20 20:29:54 UTC
This was the stacktrace I got before the fix:

java.lang.RuntimeException: Unexpected end-of-file

	at org.apache.poi.util.LittleEndianInputStream.checkEOF(LittleEndianInputStream.java:122)
	at org.apache.poi.util.LittleEndianInputStream.readFully(LittleEndianInputStream.java:134)
	at org.apache.poi.util.LittleEndianInputStream.readFully(LittleEndianInputStream.java:128)
	at org.apache.poi.xssf.binary.XSSFBParser.readNext(XSSFBParser.java:93)
	at org.apache.poi.xssf.binary.XSSFBParser.parse(XSSFBParser.java:62)
	at org.apache.poi.xssf.binary.XSSFBSharedStringsTable.readFrom(XSSFBSharedStringsTable.java:82)
	at org.apache.poi.xssf.binary.XSSFBSharedStringsTable.<init>(XSSFBSharedStringsTable.java:69)
	at org.apache.poi.xssf.extractor.XSSFBEventBasedExcelExtractor.getText(XSSFBEventBasedExcelExtractor.java:120)
	at org.apache.poi.poifs.crypt.TestSecureTempZip.protectedXLSBZip(TestSecureTempZip.java:114)
Comment 2 Dominik Stadler 2017-03-20 20:42:24 UTC
+1 for fixing it. 

The fact that read() might not read as much data as requested manifests itself only very rarely, mostly when you read larger chunks than what the OS buffers or similar. I saw this in a number of other places already where it also went undetected for a long time before causing hard to detect problems.

If the change causes other code-parts to misbehave I think we should rather fix those as well, unit tests and large-scale tests ought to cover most cases anyway nowadays.
Comment 3 Tim Allison 2017-03-20 20:49:05 UTC
Thank you, Dominik.