Bug 60833 - Initialisation of record 0x31 left 4 bytes remaining still to be read.
Summary: Initialisation of record 0x31 left 4 bytes remaining still to be read.
Status: NEW
Alias: None
Product: POI
Classification: Unclassified
Component: HSSF
Version: unspecified
Hardware: PC All
Importance: P2 major with 4 votes
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-03-08 07:58 UTC by yoker.wu
Modified: 2020-07-29 18:56 UTC (History)



Attachments
this is a report table received from email. (17.50 KB, application/x-ole-storage)
2017-03-08 07:58 UTC, yoker.wu
The Excel BIFF output file created by BIFFVIEW.exe (234.64 KB, text/html)
2017-03-09 10:43 UTC, yoker.wu

Description yoker.wu 2017-03-08 07:58:57 UTC
Created attachment 34808
this is a report table received from email.
Comment 1 Javen O'Neal 2017-03-08 09:21:45 UTC
Thanks for the bug report and including the problematic file.

Before I spend time researching this, it would help us if you could answer a few questions.

I'm assuming you get an exception (what exception class?) with the message "Initialisation of record 0x31 left 4 bytes remaining still to be read" when you open the workbook with
> Workbook wb = WorkbookFactory.create(new File("buzhengc.xls"));
If not, please include a stack trace and sample code so that we can reproduce the problem.

What version of POI are you using?

Is there anything written to stderr or the POILogger that would suggest why the FontRecord parser stopped 4 bytes short of the end of the record?

Does this file open without any errors in Microsoft Excel or other spreadsheet application? If so, what version?

Thanks in advance for the info.
Comment 2 Dominik Stadler 2017-03-08 14:08:03 UTC
FYI, Bug 57093 sounds similar.
Comment 3 yoker.wu 2017-03-09 01:07:11 UTC
@Javen O'Neal 

The file opens without any errors in Microsoft Excel and in WPS.

I tested with poi-3.15.jar.

This is my code:

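import java.io.FileInputStream;
import java.io.InputStream;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.poifs.filesystem.DirectoryEntry;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

public static void main(String[] args) throws Exception {
	String fileName = "F:\\Desktop\\buzhengc.xls";

	// Close the stream even if POI throws while reading the workbook.
	try (InputStream inputStream = new FileInputStream(fileName)) {
		POIFSFileSystem fs = new POIFSFileSystem(inputStream);

		// List the OLE2 directory entries of the file.
		DirectoryEntry root = fs.getRoot();
		System.out.println(root.getEntryNames());

		// The exception is thrown from this constructor.
		HSSFWorkbook hssfworkbook = new HSSFWorkbook(fs);
		System.out.println(hssfworkbook.getNameName(0));
	}
}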


I got this error:

[Workbook]
Exception in thread "main" org.apache.poi.hssf.record.RecordInputStream$LeftoverDataException: Initialisation of record 0x31(FontRecord) left 4 bytes remaining still to be read.
	at org.apache.poi.hssf.record.RecordInputStream.hasNextRecord(RecordInputStream.java:174)
	at org.apache.poi.hssf.record.RecordFactoryInputStream.nextRecord(RecordFactoryInputStream.java:253)
	at org.apache.poi.hssf.record.RecordFactory.createRecords(RecordFactory.java:494)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:341)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:304)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:251)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:221)
	at com.test.pppp.main(test.java:26)
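
For readers unfamiliar with the message, here is a minimal sketch of how a "left N bytes remaining" condition can arise. It assumes the usual BIFF framing (a 2-byte little-endian record id followed by a 2-byte little-endian body length) and simply compares the declared length with the number of bytes a parser consumed. The class and method names are hypothetical and this is not POI's actual implementation; the 24/20 byte counts are illustrative only.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration only; this is not POI code.
class LeftoverCheckSketch {

    /**
     * Reads one record header and compares the declared body length with the
     * number of body bytes the record parser consumed.
     * Returns a message if bytes are left over, otherwise null.
     */
    static String describe(byte[] recordBytes, int bytesConsumedByParser) {
        ByteBuffer buf = ByteBuffer.wrap(recordBytes).order(ByteOrder.LITTLE_ENDIAN);
        int sid = buf.getShort() & 0xFFFF;            // record id, e.g. 0x31 for FONT
        int declaredLength = buf.getShort() & 0xFFFF; // body length from the header
        int remaining = declaredLength - bytesConsumedByParser;
        if (remaining > 0) {
            return "Initialisation of record 0x" + Integer.toHexString(sid).toUpperCase()
                    + " left " + remaining + " bytes remaining still to be read.";
        }
        return null;
    }

    public static void main(String[] args) {
        // Header only: id 0x0031, declared body length 0x0018 (24 bytes).
        byte[] header = {0x31, 0x00, 0x18, 0x00};
        // Assume the parser understood only 20 of the 24 declared bytes.
        System.out.println(describe(header, 20));
    }
}
```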
Comment 4 yoker.wu 2017-03-09 10:43:23 UTC
Created attachment 34812
The Excel BIFF output file created by BIFFVIEW.exe

I found another problem.

There is a WINDOW2 record before a ROW record.
Comment 5 lintongchuan 2017-05-05 01:48:59 UTC
I get the same exception when reading an xls file.
The stack trace:
org.apache.poi.hssf.record.RecordInputStream$LeftoverDataException: Initialisation of record 0x31(FontRecord) left 4 bytes remaining still to be read.
	at org.apache.poi.hssf.record.RecordInputStream.hasNextRecord(RecordInputStream.java:177) ~[poi-3.16.jar:3.16]
	at org.apache.poi.hssf.record.RecordFactoryInputStream.nextRecord(RecordFactoryInputStream.java:234) ~[poi-3.16.jar:3.16]
	at org.apache.poi.hssf.record.RecordFactory.createRecords(RecordFactory.java:494) ~[poi-3.16.jar:3.16]
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:348) ~[poi-3.16.jar:3.16]
	at org.apache.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:99) ~[poi-ooxml-3.16.jar:3.16]
	at org.apache.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:182) ~[poi-ooxml-3.16.jar:3.16]
	at org.apache.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:144) ~[poi-ooxml-3.16.jar:3.16]
Comment 6 lintongchuan 2017-05-05 01:55:34 UTC
The file opens without problems in Microsoft Excel on macOS.
Comment 7 lintongchuan 2017-05-05 02:11:17 UTC
(In reply to lintongchuan from comment #5)

The file opens without problems in Microsoft Excel on macOS.
When I open the file in Microsoft Excel and save it as another file, the new file can be read and processed correctly.
Comment 8 Tim Allison 2017-05-05 11:29:00 UTC
We have 15 stacktraces like this in our regression corpus for Tika.  Looking at the file attached here (first) and the file attached on Bug 57093 (second), I was hoping that the first byte or two of the leftover data specified a length somehow.

However, from govdocs1 085890.xls (third), it looks like junk at the end of the font record.  By junk, of course, I mean, "I don't understand why it's there"...like junk DNA. :)  But seriously, in 085890.xls, when I open the file in Excel and search for "providing", I don't find anything.

First line is font name : length
Remaining lines are: byte index : byte&0xff : char (if above 20)

FONT NAME:黑体 : 2
0 : 0 :  
1 : 0 :  
2 : 0 :  
3 : 0 : 

FONT NAME:MS Sans Serif : 13
0 : 19 :  
1 : 0 :  
2 : 1 :  
3 : 0 :  
4 : 0 :  
5 : 88 : X
6 : 1 :  
7 : 0 :  
8 : 0 :  
9 : 89 : Y
10 : 95 : _
11 : 41 : )
12 : 63 : ?
13 : 95 : _
14 : 41 : )
15 : 59 : ;
16 : 95 : _
17 : 40 : (
18 : 64 : @
19 : 95 : _
20 : 41 : )
21 : 0 :  

FONT NAME:MS Sans Serif : 13
0 : 116 : t
1 : 129 :  
2 : 84 : T
3 : 73 : I
4 : 84 : T
5 : 85 : U
6 : 84 : T
7 : 73 : I
8 : 79 : O
9 : 78 : N
10 : 95 : _
11 : 80 : P
12 : 82 : R
13 : 79 : O
14 : 86 : V
15 : 73 : I
16 : 68 : D
17 : 73 : I
18 : 78 : N
19 : 71 : G
20 : 95 : _
21 : 68 : D
22 : 65 : A
23 : 84 : T
24 : 65 : A
25 : 95 : _
26 : 73 : I
27 : 68 : D
28 : 10 :  
29 : 0 :  
30 : 0 :  
31 : 67 : C
32 : 79 : O
33 : 78 : N
34 : 84 : T
35 : 65 : A
36 : 67 : C
37 : 84 : T
38 : 95 : _
39 : 73 : I
40 : 68 : D
41 : 20 :
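
A dump in the format above could be produced along these lines. This is a rough sketch, not the code actually used for the listings: it assumes the BIFF8 FONT record body layout (14 bytes of fixed fields, a 1-byte character count, a 1-byte flags field whose low bit selects 16-bit characters, then the font name) and treats everything after the name as leftover. The class name and the printable-character cutoff are assumptions.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of POI: dumps whatever follows the font name.
class FontRecordTrailingBytesSketch {

    /** Parses a FONT record body (without the 4-byte record header). */
    static List<String> dump(byte[] body) {
        ByteBuffer buf = ByteBuffer.wrap(body).order(ByteOrder.LITTLE_ENDIAN);
        buf.position(14);                      // skip height, attributes, colour, weight, etc.
        int nameLength = buf.get() & 0xFF;     // number of characters in the font name
        int flags = buf.get() & 0xFF;          // bit 0 set: 2 bytes per character
        boolean wide = (flags & 0x01) != 0;
        byte[] nameBytes = new byte[wide ? nameLength * 2 : nameLength];
        buf.get(nameBytes);
        String name = new String(nameBytes,
                wide ? StandardCharsets.UTF_16LE : StandardCharsets.ISO_8859_1);

        List<String> lines = new ArrayList<>();
        lines.add("FONT NAME:" + name + " : " + nameLength);
        // Everything still unread is the leftover data the exception complains about.
        for (int index = 0; buf.hasRemaining(); index++) {
            int b = buf.get() & 0xFF;
            String ch = (b > 0x20 && b < 0x7F) ? String.valueOf((char) b) : " ";
            lines.add(index + " : " + b + " : " + ch);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Synthetic body: zeroed fixed fields, a 2-character wide name, 4 trailing zero bytes.
        byte[] name = "\u9ed1\u4f53".getBytes(StandardCharsets.UTF_16LE);
        byte[] body = new byte[14 + 2 + name.length + 4];
        body[14] = 2;
        body[15] = 1;
        System.arraycopy(name, 0, body, 16, name.length);
        dump(body).forEach(System.out::println);
    }
}
```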
Comment 9 Tim Allison 2017-05-05 11:39:41 UTC
Whoa... and govdocs1/093/093996.xls has seven font records with an extra 1918 bytes!

No intelligible text (on a quick look)...

FONT NAME:MS Sans Serif : 13
0 : 149 :  
1 : 129 :  
2 : 95 : _
3 : 41 : )
4 : 59 : ;
5 : 95 : _
...
1893 : 0 :  
1894 : 115 : s
1895 : 142 :  
1896 : 78 : N
1897 : 0 :  
1898 : 75 : K
1899 : 161 :  
1900 : 78 : N
1901 : 0 :  
1902 : 3 :  
1903 : 180 :  
1904 : 78 : N
1905 : 0 :  
1906 : 227 :  
1907 : 198 :  
1908 : 78 : N
1909 : 0 :  
1910 : 145 :  
1911 : 214 :  
1912 : 78 : N
1913 : 0 :  
1914 : 48 : 0
1915 : 65 : A
1916 : 85 : U
1917 : 8 :
Comment 10 Tim Allison 2017-05-05 12:47:34 UTC
Stats from our regression corpus on which Excel records cause LeftoverDataExceptions (Record \t number of exceptions).

0x850(ChartFRTInfoRecord) 	763  (Bug 47247)
0x85(BoundSheetRecord) 	95
0x1D(SelectionRecord) 	35
0x31(FontRecord) 	15
0x203(NumberRecord) 	8
0x42(CodepageRecord) 	5
0x3C(ContinueRecord) 	2
0x868(FeatRecord) 	2
0x5B(FileSharingRecord) 	1
0x5F(SaveRecalcRecord) 	1
0xE(PrecisionRecord) 	1
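
A tally like the one above can be derived from the exception messages alone. The following is a small sketch under the assumption that each message has the form seen in this bug ("Initialisation of record 0x31(FontRecord) left 4 bytes remaining still to be read."); the class name and the way messages are collected are hypothetical, and this is not the script used for the stats above.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: counts LeftoverDataException messages per record type.
class LeftoverTallySketch {

    // Captures the record id with its optional class name, e.g. "0x31(FontRecord)".
    private static final Pattern MESSAGE = Pattern.compile(
            "Initialisation of record (0x[0-9A-Fa-f]+(?:\\([^)]*\\))?) left (\\d+) bytes");

    static Map<String, Integer> tally(Iterable<String> messages) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String message : messages) {
            Matcher m = MESSAGE.matcher(message);
            if (m.find()) {
                counts.merge(m.group(1), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = tally(Arrays.asList(
                "Initialisation of record 0x31(FontRecord) left 4 bytes remaining still to be read.",
                "Initialisation of record 0x31(FontRecord) left 22 bytes remaining still to be read.",
                "Initialisation of record 0x85(BoundSheetRecord) left 2 bytes remaining still to be read."));
        counts.forEach((record, n) -> System.out.println(record + "\t" + n));
    }
}
```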
Comment 11 Slava 2020-07-29 18:56:03 UTC
I'm experiencing the same issue while using Tika. The failure is very annoying and prevents us from parsing many Excel files.
Thanks