Bug 47415

Summary: Reading XLS files throws RecordFormatException
Product: POI
Reporter: Luc Girardin <luc.girardin>
Component: HSSF
Assignee: POI Developers List <dev>
Status: RESOLVED FIXED
Severity: critical
CC: luc.girardin
Priority: P1
Version: 3.5-dev
Target Milestone: ---
Hardware: Macintosh
OS: Mac OS X 10.4
Attachments: Excel document that throws RecordFormatException
The original file printed in landscape and printed once from 12.1.9 Excel 2008 - 090515

Description Luc Girardin 2009-06-24 04:37:37 UTC
Created attachment 23866 [details]
Excel document that throws RecordFormatException

Trying to read various XLS files (example attached) created by Excel 2008 for Mac (v. 12.1.9) using POI 3.5 beta 6 consistently throws the following exception:

org.apache.poi.hssf.record.RecordFormatException: Duplicate PageSettingsBlock record (sid=0x4d)
	at org.apache.poi.hssf.record.aggregates.PageSettingsBlock.checkNotPresent(PageSettingsBlock.java:202)
	at org.apache.poi.hssf.record.aggregates.PageSettingsBlock.readARecord(PageSettingsBlock.java:168)
	at org.apache.poi.hssf.record.aggregates.PageSettingsBlock.<init>(PageSettingsBlock.java:80)
	at org.apache.poi.hssf.model.Sheet.<init>(Sheet.java:221)
	at org.apache.poi.hssf.model.Sheet.createSheet(Sheet.java:161)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:288)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:202)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:318)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:299)
	at org.apache.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:60)

POI 3.5 beta 5 had no problem with these files. The same document in XLSX format reads just fine.
Comment 1 Josh Micich 2009-06-24 12:47:45 UTC
Fixed in svn r788157

junit added

Do you have any details on how to add setup info for two printers to a single worksheet?  It wasn't obvious how to do this in my Excel(Win/2007).
Comment 2 David Fisher 2009-06-24 13:56:38 UTC
Created attachment 23872 [details]
The original file printed in landscape and printed once from 12.1.9 Excel 2008 - 090515

Josh, I opened and saved the OP's file using Excel 12.1.9 (090515) from my Office 2008 installation. The resulting file was slightly larger. This version was switched to landscape and printed before saving. Maybe this will help.

BTW - there are two places where Excel for Mac allows the page setup to be changed: one is the Print dialog and the other is the File menu's Page Setup item. Could this be the difference? Could the Print dialog save a driver-specific form of the PageSettings block?

Let me know if you want me to perform more careful testing.
Comment 3 Josh Micich 2009-06-24 14:14:45 UTC
(In reply to comment #2)
> Created an attachment (id=23872) [details]

So the new attachment has both PLS records present, but minor changes seem to have been made to each.  When I re-saved the file with my Excel(Win/2007), both PLS records were removed.  This might have happened because my machine didn't know anything about the printers specified (in those records).

I guess it's not too important right now (how multiple PLS records work) but this information might be useful in the future if someone wants to extend POI support for print setup.
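
For anyone who wants to check whether a sheet carries more than one PLS record (sid=0x4d, as in the exception above), here is a minimal sketch in plain Java. It is not POI code and not the fix from r788157. It assumes the BIFF record stream has already been extracted from the OLE2 container into a byte array, and that each record is laid out as a 2-byte little-endian sid, a 2-byte little-endian length, then the payload. The class BiffRecordCounter and its methods are hypothetical names used only for illustration, and the sample stream in main is synthetic.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical helper (not part of POI): counts records by sid in a raw BIFF record stream. */
final class BiffRecordCounter {

    /**
     * Counts records whose sid equals targetSid.
     * Assumes data is an already-extracted record stream:
     * [sid: 2 bytes LE][length: 2 bytes LE][payload: length bytes] repeated.
     * Fewer than 4 trailing bytes are ignored.
     */
    static int countRecords(byte[] data, int targetSid) {
        int count = 0;
        int pos = 0;
        while (pos + 4 <= data.length) {
            int sid = (data[pos] & 0xFF) | ((data[pos + 1] & 0xFF) << 8);
            int len = (data[pos + 2] & 0xFF) | ((data[pos + 3] & 0xFF) << 8);
            if (pos + 4 + len > data.length) {
                throw new IllegalArgumentException("Truncated record at offset " + pos);
            }
            if (sid == targetSid) {
                count++;
            }
            pos += 4 + len; // skip the header and the payload
        }
        return count;
    }

    /** Builds one synthetic record: little-endian sid and length, then the payload. */
    static byte[] record(int sid, byte[] payload) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(sid & 0xFF);
        out.write((sid >> 8) & 0xFF);
        out.write(payload.length & 0xFF);
        out.write((payload.length >> 8) & 0xFF);
        out.write(payload, 0, payload.length);
        return out.toByteArray();
    }

    /** Concatenates byte arrays in order. */
    static byte[] concat(byte[]... parts) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] part : parts) {
            out.write(part, 0, part.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Synthetic stream: two records with sid 0x4d around one unrelated record.
        byte[] stream = concat(
                record(0x4d, new byte[] {1, 2}),
                record(0x14, new byte[0]),
                record(0x4d, new byte[] {3}));
        System.out.println("Records with sid 0x4d: " + countRecords(stream, 0x4d));
    }
}
```

A count above one for sid 0x4d would correspond to the situation in the two attachments, where both PLS records are present in the same sheet.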