While trying to write an example of TestSecureTempZip's protectedTempZip for the new XLSB parser, I got an exception, which I tracked down to LittleEndianInputStream's readFully. It doesn't actually read fully; rather, it makes a single call to read and then checks whether that hit EOF. When I fixed it (i.e. copied/pasted from commons-io IOUtils readFully) to keep reading until either the requested length or EOF was reached, the example worked with an encrypted xlsb. All unit tests, including the stress tests, pass. Any objections to fixing this? Or does this subtly change the expectations of readFully, so that I should leave it as is? This seems like such a core component that we should have found this earlier?!
This was the stacktrace I got before the fix:

java.lang.RuntimeException: Unexpected end-of-file
    at org.apache.poi.util.LittleEndianInputStream.checkEOF(LittleEndianInputStream.java:122)
    at org.apache.poi.util.LittleEndianInputStream.readFully(LittleEndianInputStream.java:134)
    at org.apache.poi.util.LittleEndianInputStream.readFully(LittleEndianInputStream.java:128)
    at org.apache.poi.xssf.binary.XSSFBParser.readNext(XSSFBParser.java:93)
    at org.apache.poi.xssf.binary.XSSFBParser.parse(XSSFBParser.java:62)
    at org.apache.poi.xssf.binary.XSSFBSharedStringsTable.readFrom(XSSFBSharedStringsTable.java:82)
    at org.apache.poi.xssf.binary.XSSFBSharedStringsTable.<init>(XSSFBSharedStringsTable.java:69)
    at org.apache.poi.xssf.extractor.XSSFBEventBasedExcelExtractor.getText(XSSFBEventBasedExcelExtractor.java:120)
    at org.apache.poi.poifs.crypt.TestSecureTempZip.protectedXLSBZip(TestSecureTempZip.java:114)
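For reference, the looping behaviour described above can be sketched roughly as follows. This is only an illustration of the read-until-length-or-EOF idea, not the change that was committed: the class name ReadFullySketch is made up, and it throws EOFException where the existing code throws a RuntimeException from checkEOF.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical standalone sketch; not the actual LittleEndianInputStream code.
final class ReadFullySketch {

    /**
     * Reads exactly len bytes into buf starting at off.
     * Keeps calling read() until len bytes have arrived or EOF is hit.
     */
    static void readFully(InputStream in, byte[] buf, int off, int len) throws IOException {
        int total = 0;
        while (total < len) {
            // read() may legitimately return fewer bytes than requested
            int n = in.read(buf, off + total, len - total);
            if (n < 0) {
                // Assumption: signal EOF with EOFException instead of RuntimeException
                throw new EOFException(
                        "Unexpected end-of-file after " + total + " of " + len + " bytes");
            }
            total += n;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10];
        readFully(new ByteArrayInputStream(new byte[10]), data, 0, data.length);
        System.out.println("read " + data.length + " bytes");
    }
}
```

The only difference from the single-call version is the loop: a short read is treated as "keep going", and only a return value of -1 before the length is reached counts as an error.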
+1 for fixing it. The fact that read() might not return as much data as requested shows up only very rarely, mostly when you read chunks larger than what the OS buffers, or similar. I have already seen this in a number of other places, where it also went undetected for a long time before causing hard-to-detect problems. If the change causes other parts of the code to misbehave, I think we should fix those as well; unit tests and large-scale tests ought to cover most cases nowadays anyway.
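Since short reads are so rare with ordinary files, one way to provoke them in a test is to wrap the stream so that each read() call hands back only a few bytes. A small sketch, assuming nothing about POI's own test utilities (ShortReadDemo and ChunkedInputStream are invented names for illustration):

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration; not part of POI.
final class ShortReadDemo {

    /** Wrapper that never returns more than maxChunk bytes from one read call. */
    static final class ChunkedInputStream extends FilterInputStream {
        private final int maxChunk;

        ChunkedInputStream(InputStream in, int maxChunk) {
            super(in);
            this.maxChunk = maxChunk;
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            // Cap the request, mimicking a source that delivers data in small pieces
            return super.read(b, off, Math.min(len, maxChunk));
        }
    }

    /** Returns how many bytes a single read call delivers when asked for 'requested'. */
    static int singleRead(int available, int requested, int maxChunk) throws IOException {
        InputStream in = new ChunkedInputStream(
                new ByteArrayInputStream(new byte[available]), maxChunk);
        return in.read(new byte[requested], 0, requested);
    }

    public static void main(String[] args) throws IOException {
        // 10 bytes are available and 10 are requested, but one call yields only 4
        System.out.println(singleRead(10, 10, 4)); // prints 4
    }
}
```

A readFully that makes a single read call and then checks for EOF would see 4 of 10 bytes here, even though the stream is nowhere near its end.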
Thank you, Dominik. r1787846