Bug 48325

Summary: bad text 'Page &P of &N' and similar errors when reading in spreadsheets
Product: POI
Reporter: Trejkaz (pen name) <trejkaz>
Component: HSSF
Assignee: POI Developers List <dev>
Severity: regression
CC: aelyacoubi, tanakas
Priority: P2    
Version: 3.6-FINAL   
Target Milestone: ---   
Hardware: Sun   
OS: Solaris   
Attachments: Sample file, Proposed patch

Description Trejkaz (pen name) 2009-12-01 19:41:07 UTC
We have seen this error in logs from customer sites, and it prevents getting the text out (HeaderFooter only has API for getLeft(), getCenter() and getRight(), and all three throw the exception):

Caused by: java.lang.IllegalStateException: bad text 'Page &P of &N'.
    at org.apache.poi.hssf.usermodel.HeaderFooter.splitParts(HeaderFooter.java:77)
    at org.apache.poi.hssf.usermodel.HeaderFooter.getLeft(HeaderFooter.java:87)
    at com.nuix.data.file.office.poi.PoiExcelFileReader$SheetReader.createNextReader(SourceFile:273)
    at com.nuix.util.LazyCompositeReader.isReaderAvailable(SourceFile:93)
    at com.nuix.util.LazyCompositeReader.read(SourceFile:40)
    at com.nuix.util.LazyCompositeReader.read(SourceFile:42)
    at com.nuix.data.util.CachingReaderFactory.readFully(SourceFile:201)
    at com.nuix.data.util.CachingReaderFactory.createReader(SourceFile:104)
    ... 14 more

We have seen the same thing with '&A' as well.

This is a regression from before 3.5 beta 6, when these methods used to silently return null instead.  However, I'm not sure that returning null is correct either. What would make the most sense is to match the behaviour of Excel in this situation, which is presumably to treat all the text as if it's the left or the center (but which?).
Comment 1 Trejkaz (pen name) 2009-12-03 19:20:09 UTC
Created attachment 24668 [details]
Sample file

Sample file attached. It was created by taking an XLS file output from POI and then mangling the footer to have the problem described.

Test case would look something like this (though there are a few calls to our own utility methods at the top):

    public void testIllegalFooterText() throws Exception {
        HSSFWorkbook wb = loadWorkbook(getDataFile("office/illegal-footer-text.xls"));
        HSSFSheet sh = wb.getSheetAt(0);
        HSSFFooter f = sh.getFooter();

        // The legal form of the footer would have been "&CBlahBlah Blah Blah  " but something
        // left the &C off.  In this case it was me and a hex editor.  In the cases we have seen
        // in the wild it's probably some idiot who thought they knew how to add a footer but didn't.

        assertEquals("Left text should be empty", "", f.getLeft());
        assertEquals("Right text should be empty", "", f.getRight());
        assertEquals("Center text should contain the illegal value", "BlahBlah blah blah  ", f.getCenter());
    }
Comment 2 Trejkaz (pen name) 2009-12-03 19:21:22 UTC
Created attachment 24669 [details]
Proposed patch

Patch which makes that test pass.
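
For readers without the attachment, here is a minimal sketch of the kind of lenient splitting the test above expects. It is not the attached patch and not POI's actual implementation; the class name LenientHeaderFooterSketch and its splitParts method are hypothetical. It assumes the section markers are &L, &C and &R, and that text appearing before any marker belongs to the center, which is what the test asserts.

```java
// Hypothetical illustration only: not the attached patch and not POI code.
class LenientHeaderFooterSketch {

    /**
     * Splits raw header/footer text into {left, center, right}.
     * Assumption: text with no leading &L/&C/&R marker is treated as center text
     * instead of causing an exception.
     */
    static String[] splitParts(String text) {
        StringBuilder[] parts = {
            new StringBuilder(), new StringBuilder(), new StringBuilder()
        };
        int current = 1; // 0 = left, 1 = center, 2 = right; start in the center
        if (text != null) {
            int i = 0;
            while (i < text.length()) {
                char c = text.charAt(i);
                if (c == '&' && i + 1 < text.length()) {
                    char code = text.charAt(i + 1);
                    int section = "LCR".indexOf(code);
                    if (section >= 0) {
                        // Section marker: switch target section and drop the marker.
                        current = section;
                        i += 2;
                        continue;
                    }
                    if (code == '&') {
                        // Escaped ampersand: keep both characters, never read it as a marker.
                        parts[current].append("&&");
                        i += 2;
                        continue;
                    }
                }
                // Anything else (including field codes such as &P, &N, &A) is kept verbatim.
                parts[current].append(c);
                i++;
            }
        }
        return new String[] {
            parts[0].toString(), parts[1].toString(), parts[2].toString()
        };
    }

    public static void main(String[] args) {
        String[] p = splitParts("Page &P of &N");
        System.out.println("left=[" + p[0] + "] center=[" + p[1] + "] right=[" + p[2] + "]");
    }
}
```

With this approach, 'Page &P of &N' and '&A' end up entirely in the center part, while well-formed text such as "&LLeft&CMid&RRight" is still split into three parts. The sketch is deliberately simplified: it does not special-case quoted font codes, which a real parser may need to handle.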
Comment 3 Nick Burch 2010-09-21 07:33:35 UTC
Thanks for the patch and test, applied in r999320.