Bug 64130

Summary: Regression in OldSheetRecord
Product: POI
Reporter: Tim Allison <tallison>
Component: HSSF
Assignee: POI Developers List <dev>
Status: RESOLVED FIXED    
Severity: normal    
Priority: P2    
Version: 4.1.x-dev   
Target Milestone: ---   
Hardware: PC   
OS: Linux   
Attachments: example file embedded in govdocs1 296107.doc

Description Tim Allison 2020-02-10 15:18:31 UTC
Created attachment 36998
example file embedded in govdocs1 296107.doc

We identified a fairly common regression in parsing old Excel files during the most recent regression test run for POI 4.1.2-rc2.

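With r1872302, a readByte() call was added to the OldSheetRecord constructor immediately after "field_4_sheetname_length" is read. When the sheet name length is 0, the record can apparently end right there, so readByte() fails the record position check. We should check whether the sheet name length == 0 before trying to read the byte.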
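
A minimal sketch of the proposed guard, for illustration only. It is not the POI source: the class SheetNameGuardSketch, the method readSheetName, and the record layout (one unsigned length byte followed by the name bytes) are assumptions, and java.nio.ByteBuffer stands in for RecordInputStream. What the extra byte is used for is not stated in this report, so the sketch only peeks at it and rewinds.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the tail of an old-format sheet record:
// [1 byte: sheet name length][length bytes: sheet name]
final class SheetNameGuardSketch {

    // guarded == false mimics the regression: always read one more byte.
    // guarded == true applies the proposed check on the length first.
    static String readSheetName(byte[] recordTail, boolean guarded) {
        ByteBuffer in = ByteBuffer.wrap(recordTail);

        // Unsigned length byte, analogous to field_4_sheetname_length.
        int sheetNameLength = in.get() & 0xFF;

        if (!guarded || sheetNameLength > 0) {
            // Peek at the next byte, then rewind. With a zero-length name
            // at the end of the record there is no byte left to read, and
            // get() throws BufferUnderflowException (the analogue of the
            // RecordFormatException in the stack trace).
            in.mark();
            in.get();
            in.reset();
        }

        byte[] name = new byte[sheetNameLength];
        in.get(name); // reading zero bytes is a no-op
        return new String(name, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] emptyName = {0};
        System.out.println("guarded: '" + readSheetName(emptyName, true) + "'");
        try {
            readSheetName(emptyName, false);
        } catch (java.nio.BufferUnderflowException e) {
            System.out.println("unguarded: ran past the end of the record");
        }
    }
}
```

With the guard in place, a record whose name length is 0 yields an empty sheet name instead of an exception, while records with a non-empty name are read as before.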

This causes ~550 new exceptions on the regression corpus.

Stacktrace:

Caused by: org.apache.poi.util.RecordFormatException
	at org.apache.poi.hssf.record.RecordInputStream.checkRecordPosition(RecordInputStream.java:246)
	at org.apache.poi.hssf.record.RecordInputStream.readByte(RecordInputStream.java:255)
	at org.apache.poi.hssf.record.OldSheetRecord.<init>(OldSheetRecord.java:51)
	at org.apache.poi.hssf.extractor.OldExcelExtractor.getText(OldExcelExtractor.java:242)
	at o.a.t.parser.microsoft.OldExcelParser.parse(OldExcelParser.java:57)
	at o.a.t.parser.microsoft.ExcelExtractor.parse(ExcelExtractor.java:157)
	at o.a.t.parser.microsoft.OfficeParser.parse(OfficeParser.java:183)
	at o.a.t.parser.microsoft.OfficeParser.parse(OfficeParser.java:131)
	at o.a.t.parser.CompositeParser.parse(CompositeParser.java:280)

Comment 1 Tim Allison 2020-02-10 17:39:16 UTC
Fixed in r1873863