Bug 47244 - NullPointerException with HSSFHeader
Summary: NullPointerException with HSSFHeader
Alias: None
Product: POI
Classification: Unclassified
Component: HSSF (show other bugs)
Version: 3.5-dev
Hardware: PC Linux
: P2 normal (vote)
Target Milestone: ---
Assignee: POI Developers List
Keywords: PatchAvailable
Depends on:
Reported: 2009-05-22 09:59 UTC by Jonathan Holloway
Modified: 2009-06-03 20:43 UTC (History)
1 user (show)


Note You need to log in before you can comment on or make changes to this bug.
Description Jonathan Holloway 2009-05-22 09:59:26 UTC
Apologies if this is a duplicate, HSSFHeader and HSSFFooter seem to throw NullPointerException with some spreadsheets.

Exception caught
	at org.apache.poi.hssf.usermodel.HSSFFooter.<init>(HSSFFooter.java:44)
	at org.apache.poi.hssf.usermodel.HSSFSheet.getFooter(HSSFSheet.java:934)
	at org.apache.poi.hssf.extractor.ExcelExtractor.getText(ExcelExtractor.java:386)

This happens when I'm using the org.apache.poi.hssf.extractor.ExcelExtractor in particular.  FooterRecord/HeaderReccord appear to be null in these scenarios. 

BiffViewer fails with:

     [java] java.io.FileNotFoundException: no such entry: "Workbook"
     [java] 	at org.apache.poi.poifs.filesystem.DirectoryNode.getEntry(DirectoryNode.java:272)
     [java] 	at org.apache.poi.poifs.filesystem.DirectoryNode.createDocumentInputStream(DirectoryNode.java:124)
     [java] 	at org.apache.poi.poifs.filesystem.POIFSFileSystem.createDocumentInputStream(POIFSFileSystem.java:458)

If I open the original attachment in OpenOffice and save it this cures the issue, so it's obviously some bad data in the spreadsheet.  I can't, unfortunately send you the spreadsheet.

Attached is a patch that cures the issue, I'm not sure whether it is appropriate to ignore the header/footer and return null, but it appears to cure the issue and I get my spreadsheet output in the correct format.  Please let me know if you need anything else.

Index: HSSFSheet.java
--- HSSFSheet.java	(revision 776494)
+++ HSSFSheet.java	(working copy)
@@ -37,6 +37,8 @@
 import org.apache.poi.hssf.record.DVRecord;
 import org.apache.poi.hssf.record.EscherAggregate;
 import org.apache.poi.hssf.record.ExtendedFormatRecord;
+import org.apache.poi.hssf.record.FooterRecord;
+import org.apache.poi.hssf.record.HeaderRecord;
 import org.apache.poi.hssf.record.NoteRecord;
 import org.apache.poi.hssf.record.Record;
 import org.apache.poi.hssf.record.RowRecord;
@@ -917,7 +919,11 @@
      * @return The Document header.
     public HSSFHeader getHeader() {
-        return new HSSFHeader(_sheet.getPageSettings().getHeader());
+        HeaderRecord header = _sheet.getPageSettings().getHeader();
+        if (header != null) {
+            return new HSSFHeader(header);
+        }
+        return null;
@@ -925,7 +931,11 @@
      * @return The Document footer.
     public HSSFFooter getFooter() {
-        return new HSSFFooter(_sheet.getPageSettings().getFooter());
+        FooterRecord footer = _sheet.getPageSettings().getFooter();
+        if (footer != null) {
+            return new HSSFFooter(footer);
+        }
+        return null;
Comment 1 Yegor Kozlov 2009-05-30 05:05:11 UTC
Can you attach a file that causes the problem? The suggested patch looks fine, but I would like to have a clear test case before applying it. 

Comment 2 Josh Micich 2009-05-31 01:04:19 UTC
Sorry Jonathan and Yegor, I meant to reply earlier, but hadn't got round to it.

The proposed patch is NQR because it creates a header/footer object which is detached from the HSSFSheet.  The newly created (empty) header / footer record needs to be placed into the PageSettingsBlock, otherwise updating it will have no effect.  That's easy enough, but it's messy to create objects that represent their own absence.

I noticed that header/footer API on Sheet is a little deficient.  It's hard to tell whether the usermodel objects are supposed to have a similar life-cycle as the underlying biff records.  There's no methods on Sheet for update, delete or doesExist, and I think it would clutter things to add these.  The getHeader / getFooter methods seem to be called only twice from the full POI code-base. ExcelExtractor assumes that getFooter() will return null when the footer is not present.  HeadersAndFooters example expects that the footer will get lazily created by HSSFSheet.getFooter(). This seems to suggest that HSSFHeader / HSSFFooter should exist independent of HeaderRecord / FooterRecord.

I think a better solution might involve HSSFHeader / HSSFFooter directly wrapping the PageSettingsBlock (which is lazily created, and already manages the presence of all such contained records).  The HSSFHeader / HSSFFooter would know how get / update / create and destroy the header / footer records, and tell the PageSettingsBlock as needed.  This approach probably holds up better in the XSSF world where headers and footers are further specialised to odd/even etc (i.e. XSSFHeader should know about it's odd/even variants, not XSSFSheet).

We should also update the javadoc of getHeader / getFooter to say @return never null.
Comment 3 Josh Micich 2009-06-03 20:43:51 UTC
Fixed in svn r781645

junits added

POI now tolerates missing header / footer records, and also adds them when writing.

Also changed assumption (see bug 45777) that the header / footer text cannot exceed 256 bytes.