Bug 64130 - Regression in OldSheetRecord
Summary: Regression in OldSheetRecord
Alias: None
Product: POI
Classification: Unclassified
Component: HSSF
Version: 4.1.x-dev
Hardware: PC Linux
Importance: P2 normal
Target Milestone: ---
Assignee: POI Developers List
Depends on:
Reported: 2020-02-10 15:18 UTC by Tim Allison
Modified: 2020-02-10 17:39 UTC (History)

example file embedded in govdocs1 296107.doc (7.00 KB, application/vnd.ms-excel)
2020-02-10 15:18 UTC, Tim Allison

Description Tim Allison 2020-02-10 15:18:31 UTC
Created attachment 36998
example file embedded in govdocs1 296107.doc

The most recent regression tests for POI 4.1.2-rc2 turned up a fairly common regression in parsing old Excel files.

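With r1872302, a readByte() call was added to OldSheetRecord immediately after "field_4_sheetname_length" is read. When the sheet name length is 0, the record may have no further byte to read, and readByte() then fails in checkRecordPosition() (see the stack trace below). We should check whether the sheet name length is 0 before trying to read the byte.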

This causes ~550 new exceptions on the regression corpus.


Caused by: org.apache.poi.util.RecordFormatException
	at org.apache.poi.hssf.record.RecordInputStream.checkRecordPosition(RecordInputStream.java:246)
	at org.apache.poi.hssf.record.RecordInputStream.readByte(RecordInputStream.java:255)
	at org.apache.poi.hssf.record.OldSheetRecord.<init>(OldSheetRecord.java:51)
	at org.apache.poi.hssf.extractor.OldExcelExtractor.getText(OldExcelExtractor.java:242)
	at o.a.t.parser.microsoft.OldExcelParser.parse(OldExcelParser.java:57)
	at o.a.t.parser.microsoft.ExcelExtractor.parse(ExcelExtractor.java:157)
	at o.a.t.parser.microsoft.OfficeParser.parse(OfficeParser.java:183)
	at o.a.t.parser.microsoft.OfficeParser.parse(OfficeParser.java:131)
	at o.a.t.parser.CompositeParser.parse(CompositeParser.java:280)
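
The guard proposed above could be sketched roughly as follows. This is a minimal, self-contained illustration and not the POI code or the committed fix: the class SheetNameGuardSketch and the method readSheetName are hypothetical, a ByteBuffer stands in for RecordInputStream, and the field layout (4-byte BOF position, visibility, type, length) as well as the assumption that the extra byte is always consumed before a non-empty name are simplifications. The only point is that nothing is read past the length byte when the length is 0.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the record parser; not the actual POI class.
final class SheetNameGuardSketch {

    // Assumed layout: BOF position (4 bytes), visibility (1), type (1),
    // sheet name length (1), then, only for a non-empty name,
    // one extra byte followed by the name bytes.
    static String readSheetName(byte[] recordData) {
        ByteBuffer in = ByteBuffer.wrap(recordData).order(ByteOrder.LITTLE_ENDIAN);
        in.getInt();                       // position of BOF
        in.get();                          // visibility
        in.get();                          // type
        int nameLength = in.get() & 0xFF;  // unsigned sheet name length
        if (nameLength == 0) {
            // Guard: the record may end here, so do not read another byte.
            return "";
        }
        in.get();                          // extra byte (assumed to precede the name)
        byte[] name = new byte[nameLength];
        in.get(name);
        return new String(name, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Record that ends right after a zero length byte.
        byte[] emptyName = {0, 0, 0, 0, 0, 0, 0};
        // Record with length 2, one extra byte, then "AB".
        byte[] named = {0, 0, 0, 0, 0, 0, 2, 0, 'A', 'B'};
        System.out.println("[" + readSheetName(emptyName) + "]");
        System.out.println("[" + readSheetName(named) + "]");
    }
}
```

Without the length check, the first record in main would make the unconditional in.get() throw a BufferUnderflowException, which is the analogue in this sketch of the RecordFormatException in the trace above.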
Comment 1 Tim Allison 2020-02-10 17:39:16 UTC
Fixed in r1873863