Bug 65543 - RecordFormatException: Not enough data (0) to read requested (2) bytes
Summary: RecordFormatException: Not enough data (0) to read requested (2) bytes
Status: NEW
Alias: None
Product: POI
Classification: Unclassified
Component: HSSF
Version: 5.0.0-FINAL
Hardware: PC All
Importance: P2 major
Target Milestone: ---
Assignee: POI Developers List
Depends on:
Reported: 2021-09-01 07:32 UTC by Egor Yashkov
Modified: 2021-10-09 14:52 UTC

example of files (41.79 KB, application/x-7z-compressed)
2021-09-01 07:32 UTC, Egor Yashkov

Description Egor Yashkov 2021-09-01 07:32:00 UTC
Created attachment 38006
example of files


We sometimes get an error when reading an .xls file that was modified in Excel (Microsoft Office 365). We use org.apache.poi:poi v5.0.0, and other versions give the same result, so the library might have an issue.

A simple example that reproduces the issue:

import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Workbook;

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class CheckXLSReading {
    public static void main(String[] args) throws IOException {
        InputStream inputStream = new FileInputStream("D:\\file_to_check.xls");

        Workbook workbook = new HSSFWorkbook(inputStream);
        workbook.close();
        inputStream.close();
    }
}

1) The first file, file_to_check_365.xls, was modified in "Microsoft Excel for Microsoft 365 MSO (16.0.14228.20216) 64-bit". Reading this file fails with the following error:

Console output
Exception in thread "main" org.apache.poi.util.RecordFormatException: Not enough data (0) to read requested (2) bytes
	at org.apache.poi.hssf.record.RecordInputStream.checkRecordPosition(RecordInputStream.java:246)
	at org.apache.poi.hssf.record.RecordInputStream.readShort(RecordInputStream.java:265)
	at org.apache.poi.hssf.record.common.UnicodeString.<init>(UnicodeString.java:77)
	at org.apache.poi.hssf.record.SSTDeserializer.manufactureStrings(SSTDeserializer.java:57)
	at org.apache.poi.hssf.record.SSTRecord.<init>(SSTRecord.java:235)
	at org.apache.poi.hssf.record.RecordFactory.createSingleRecord(RecordFactory.java:79)
	at org.apache.poi.hssf.record.RecordFactoryInputStream.readNextRecord(RecordFactoryInputStream.java:289)
	at org.apache.poi.hssf.record.RecordFactoryInputStream.nextRecord(RecordFactoryInputStream.java:255)
	at org.apache.poi.hssf.record.RecordFactory.createRecords(RecordFactory.java:166)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:343)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:399)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:381)
	at CheckXLSReading.main(CheckXLSReading.java:12)
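
For readers unfamiliar with the message: the trace shows readShort calling checkRecordPosition, which apparently refuses a 2-byte read because 0 bytes remain in the current record. The following is a minimal sketch of that kind of bounds check, using only the JDK. The class RecordReaderSketch and its methods are hypothetical and are not POI's actual code; the sketch only reproduces the shape of the message, not the cause of the failure in the attached file.

```java
// Simplified, hypothetical model of a bounds-checked record reader (not POI's code).
public class RecordReaderSketch {
    private final byte[] data;   // payload of the current record
    private int pos;             // index of the next unread byte

    public RecordReaderSketch(byte[] data) {
        this.data = data;
    }

    // Number of bytes still unread in this record.
    public int remaining() {
        return data.length - pos;
    }

    // Fails before reading if the record has fewer bytes left than requested.
    private void checkPosition(int required) {
        int available = remaining();
        if (available < required) {
            throw new IllegalStateException("Not enough data (" + available
                    + ") to read requested (" + required + ") bytes");
        }
    }

    // Reads a little-endian 16-bit value, the byte order used by BIFF records.
    public short readShort() {
        checkPosition(2);
        int lo = data[pos++] & 0xFF;
        int hi = data[pos++] & 0xFF;
        return (short) ((hi << 8) | lo);
    }

    public static void main(String[] args) {
        RecordReaderSketch in = new RecordReaderSketch(new byte[] {0x34, 0x12});
        System.out.println(in.readShort());   // consumes both bytes
        try {
            in.readShort();                   // nothing is left in the record
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model the second read fails because the record is exhausted, which matches the "(0)" in the reported message: the string data in the SST record seems to end earlier than the reader expects.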

2) The second file, file_to_check_2016.xls, is the previous file re-saved, with no other changes, using "Microsoft Excel 2016 (16.0.5188.1000) MSO (16.0.5188.1000) 32-bit". Reading this file produces no errors.

Console output (empty)

Could you please check this issue? Thank you in advance!