Bug 47415 - Reading XLS files throws RecordFormatException
Alias: None
Product: POI
Classification: Unclassified
Component: HSSF
Version: 3.5-dev
Hardware: Macintosh Mac OS X 10.4
Importance: P1 critical with 4 votes
Target Milestone: ---
Assignee: POI Developers List
Depends on:
Reported: 2009-06-24 04:37 UTC by Luc Girardin
Modified: 2009-06-24 14:14 UTC (History)

Excel document that throws RecordFormatException (24.50 KB, application/vnd.ms-excel)
2009-06-24 04:37 UTC, Luc Girardin
The original file printed in landscape and printed once from 12.1.9 Excel 2008 - 090515 (25.00 KB, application/vnd.ms-excel)
2009-06-24 13:56 UTC, David Fisher

Description Luc Girardin 2009-06-24 04:37:37 UTC
Created attachment 23866
Excel document that throws RecordFormatException

Reading various XLS files (example attached) created by Excel 2008 for Mac (v. 12.1.9) with POI 3.5 beta 6 consistently throws the following exception:

org.apache.poi.hssf.record.RecordFormatException: Duplicate PageSettingsBlock record (sid=0x4d)
	at org.apache.poi.hssf.record.aggregates.PageSettingsBlock.checkNotPresent(PageSettingsBlock.java:202)
	at org.apache.poi.hssf.record.aggregates.PageSettingsBlock.readARecord(PageSettingsBlock.java:168)
	at org.apache.poi.hssf.record.aggregates.PageSettingsBlock.<init>(PageSettingsBlock.java:80)
	at org.apache.poi.hssf.model.Sheet.<init>(Sheet.java:221)
	at org.apache.poi.hssf.model.Sheet.createSheet(Sheet.java:161)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:288)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:202)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:318)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:299)
	at org.apache.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:60)

POI 3.5 beta 5 had no problem with these files. The same document in XLSX format reads just fine.
Comment 1 Josh Micich 2009-06-24 12:47:45 UTC
Fixed in svn r788157

junit added

Do you have any details on how to add setup info for two printers to a single worksheet? It wasn't obvious how to do this in my Excel(Win/2007).
Comment 2 David Fisher 2009-06-24 13:56:38 UTC
Created attachment 23872
The original file printed in landscape and printed once from 12.1.9 Excel 2008 - 090515

Josh, I opened and saved the OP's file using Excel 12.1.9 (090515) from my Office 2008. The resulting file was slightly larger. This version was switched to landscape and printed before saving. Maybe this will help.

BTW - there are two places where Excel for Mac allows the page setup to be changed: one is in the Print dialog and the other is the File menu's Page Setup item. Could this be the difference? Could the Print dialog save a driver-specific form of the PageSettings block?

Let me know if you want me to perform more careful testing.
Comment 3 Josh Micich 2009-06-24 14:14:45 UTC
(In reply to comment #2)
> Created an attachment (id=23872) [details]

So the new attachment still has both PLS records, but minor changes seem to have been made to each. When I re-saved the file with my Excel(Win/2007), both PLS records were removed. This might have happened because my machine didn't know anything about the printers specified in those records.

I guess it's not too important right now (how multiple PLS records work) but this information might be useful in the future if someone wants to extend POI support for print setup.
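
For anyone who wants to check whether a sheet carries more than one PLS record, here is a minimal sketch of a scanner. It is not POI code and not how PageSettingsBlock does its check; the class and method names (PlsRecordScan, findRecords, record, concat) are made up for illustration. It assumes you already have the raw BIFF record stream as a byte array (a real .xls is an OLE2 container, so the workbook stream has to be extracted first) and relies only on the record layout of a 2-byte little-endian sid, a 2-byte little-endian length, then the data. The PLS sid 0x4d is taken from the exception message above.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of POI: finds repeated records in a raw BIFF stream.
class PlsRecordScan {
    // sid of the PLS record, as reported in the exception message (sid=0x4d)
    static final int PLS_SID = 0x004D;

    // Returns the byte offset of every record with the given sid.
    static List<Integer> findRecords(byte[] stream, int sid) {
        List<Integer> offsets = new ArrayList<Integer>();
        int pos = 0;
        while (pos + 4 <= stream.length) {
            int recSid = (stream[pos] & 0xFF) | ((stream[pos + 1] & 0xFF) << 8);
            int len = (stream[pos + 2] & 0xFF) | ((stream[pos + 3] & 0xFF) << 8);
            if (pos + 4 + len > stream.length) {
                break; // truncated record: stop rather than read past the end
            }
            if (recSid == sid) {
                offsets.add(pos);
            }
            pos += 4 + len; // skip the 4-byte header and the record data
        }
        return offsets;
    }

    // Builds one record (header plus data) so the scanner can be tried on synthetic input.
    static byte[] record(int sid, byte[] data) {
        byte[] out = new byte[4 + data.length];
        out[0] = (byte) (sid & 0xFF);
        out[1] = (byte) ((sid >> 8) & 0xFF);
        out[2] = (byte) (data.length & 0xFF);
        out[3] = (byte) ((data.length >> 8) & 0xFF);
        System.arraycopy(data, 0, out, 4, data.length);
        return out;
    }

    // Concatenates records into a single stream.
    static byte[] concat(byte[]... parts) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        for (byte[] p : parts) {
            buf.write(p, 0, p.length);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        // Synthetic stream: a placeholder non-PLS record followed by two PLS records.
        byte[] stream = concat(
                record(0x0014, new byte[2]),
                record(PLS_SID, new byte[3]),
                record(PLS_SID, new byte[1]));
        List<Integer> hits = findRecords(stream, PLS_SID);
        System.out.println("PLS records found: " + hits.size() + " at offsets " + hits);
    }
}
```

If findRecords returns more than one offset for PLS_SID within a sheet's records, that is the situation a strict duplicate check would reject. Comparing the data bytes at each offset would show what differs between the two records.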