64130 – Regression in OldSheetRecord

Summary: Regression in OldSheetRecord

Status:	RESOLVED FIXED

Alias:	None

Product:	POI
Classification:	Unclassified
Component:	HSSF
Version:	4.1.x-dev
Hardware:	PC Linux

Importance:	P2 normal
Target Milestone:	---
Assignee:	POI Developers List

URL:
Keywords:

Depends on:
Blocks:

Reported:	2020-02-10 15:18 UTC by Tim Allison
Modified:	2020-02-10 17:39 UTC
CC List:	0 users

Attachments
example file embedded in govdocs1 296107.doc (7.00 KB, application/vnd.ms-excel) 2020-02-10 15:18 UTC, Tim Allison

Description Tim Allison 2020-02-10 15:18:31 UTC

Created attachment 36998
example file embedded in govdocs1 296107.doc

We identified a fairly common regression in parsing old Excel files during the most recent regression tests for POI 4.1.2-rc2.

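r1872302 added a readByte() call to the OldSheetRecord constructor, immediately after "field_4_sheetname_length" is read. When the sheet name length is 0, the record may have no further byte to read, and the call fails with the RecordFormatException shown below. We should check whether the sheetname length == 0 before trying to read the byte.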
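
A minimal, self-contained sketch of the proposed guard follows. It is not POI code: the class SheetNameGuardSketch, its method names, and the NO_EXTRA_BYTE sentinel are hypothetical, and a java.nio.ByteBuffer stands in for RecordInputStream. It assumes the record ends right after the length field when the length is 0, which is what the stacktrace suggests.

```java
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;

// Hypothetical stand-in for the tail of a sheet record: [name length][bytes...].
// This illustrates the guard only; it is not the actual OldSheetRecord code.
class SheetNameGuardSketch {

    /** Sentinel returned when there is no byte to read after the length field. */
    static final int NO_EXTRA_BYTE = -1;

    /** Mirrors the regression: always reads one more byte after the length. */
    static int readByteAfterLengthUnguarded(ByteBuffer record) {
        int sheetNameLength = record.get() & 0xFF;
        // Throws BufferUnderflowException if the record ends after the length field.
        return record.get() & 0xFF;
    }

    /** Proposed behaviour: skip the extra read when the sheet name is empty. */
    static int readByteAfterLengthGuarded(ByteBuffer record) {
        int sheetNameLength = record.get() & 0xFF;
        if (sheetNameLength == 0) {
            return NO_EXTRA_BYTE;
        }
        return record.get() & 0xFF;
    }

    public static void main(String[] args) {
        byte[] emptyName = {0};         // length 0, nothing follows
        byte[] named = {2, 'A', 'B'};   // length 2, then two name bytes

        System.out.println(readByteAfterLengthGuarded(ByteBuffer.wrap(emptyName)));
        System.out.println(readByteAfterLengthGuarded(ByteBuffer.wrap(named)));

        try {
            readByteAfterLengthUnguarded(ByteBuffer.wrap(emptyName));
        } catch (BufferUnderflowException e) {
            System.out.println("unguarded read failed on empty sheet name");
        }
    }
}
```

In this sketch the unguarded variant fails on the empty-name input in the same way the stacktrace fails inside readByte(), while the guarded variant returns without touching the stream.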

This causes ~550 new exceptions on the regression corpus.

Stacktrace:

Caused by: org.apache.poi.util.RecordFormatException
	at org.apache.poi.hssf.record.RecordInputStream.checkRecordPosition(RecordInputStream.java:246)
	at org.apache.poi.hssf.record.RecordInputStream.readByte(RecordInputStream.java:255)
	at org.apache.poi.hssf.record.OldSheetRecord.<init>(OldSheetRecord.java:51)
	at org.apache.poi.hssf.extractor.OldExcelExtractor.getText(OldExcelExtractor.java:242)
	at o.a.t.parser.microsoft.OldExcelParser.parse(OldExcelParser.java:57)
	at o.a.t.parser.microsoft.ExcelExtractor.parse(ExcelExtractor.java:157)
	at o.a.t.parser.microsoft.OfficeParser.parse(OfficeParser.java:183)
	at o.a.t.parser.microsoft.OfficeParser.parse(OfficeParser.java:131)
	at o.a.t.parser.CompositeParser.parse(CompositeParser.java:280)

Comment 1 Tim Allison 2020-02-10 17:39:16 UTC

Fixed in r1873863