Bug 58355 - Paragraph not first in table exception when processing first page footer
Status: NEW
Alias: None
Product: POI
Classification: Unclassified
Component: HWPF
Version: unspecified
Hardware: PC All
Importance: P2 minor
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-09-10 16:08 UTC by Tim Allison
Modified: 2016-01-05 14:32 UTC
Description Tim Allison 2015-09-10 16:08:32 UTC
On TIKA-1733, Christophe Lacroix reported this exception and attached a triggering document.

I can reproduce this bug with POI's trunk.

When I iterate through the paragraphs in the overall range (headerStories.getRange()), the paragraph for the first page footer appears to be at offsets 2044 to 2045. With proper handling (is this paragraph in a table? If yes, grab the table text), the full table turns out to span offsets 2044 through 2353.

P OFFSETS:2044 -> 2045
table text: > <
<ROW> Prosys - 23, rue du Capitaine Ferber - 92130 Issy-les-Moulineaux - Tel : +33 (0)1 41 23 27 77 - Fax : +33 (0)1 41 23 27 99 - www.prosys.fr - Une société du Groupe Moniteur</ROW>
<ROW>SAS au capital de 780 000 euros - SIRET : 344 894 985 000 52 RCS Nanterre - APE : 6202 A - TVA intracommunautaire : FR 32 344 894 985</ROW>
table Offsets: 2044 -> 2353

When I run the same iteration over headerStories.getFirstFooterSubrange(), the target range is 2045 -> 2355, which starts one offset past the table start (2044) reported above.

RANGE: 2045 : 2355
P OFFSETS:2045 : 2217
table text: >Prosys - 23, rue du Capitaine Ferber - 92130 Issy-les-Moulineaux - Tel : +33 (0)1 41 23 27 77 - Fax : +33 (0)1 41 23 27 99 - www.prosys.fr - Une société du Groupe Moniteur<

java.lang.IllegalArgumentException: This paragraph is not the first one in the table
	at org.apache.poi.hwpf.usermodel.Range.getTable(Range.java:927)
	at org.apache.poi.hwpf.TestHWPFRangeParts.testTableInFirstPageFooter(TestHWPFRangeParts.java:214)
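
To make the offset mismatch concrete, here is a minimal sketch of the arithmetic, using only the numbers printed above. It does not call POI. The class and method names are hypothetical, and the idea that the exception comes from the subrange starting one character into the table is my assumption from the printed offsets, not something I have confirmed in the POI code.

```java
// Simplified model of the offsets quoted in this report; it does not use POI.
// FooterTableOffsets and its methods are hypothetical names for illustration.
final class FooterTableOffsets {

    /** True when a range starting at rangeStart begins strictly inside the table. */
    static boolean startsInsideTable(int rangeStart, int tableStart, int tableEnd) {
        return rangeStart > tableStart && rangeStart < tableEnd;
    }

    /** Number of leading table characters that the range cuts off (0 if none). */
    static int clippedLeadingChars(int rangeStart, int tableStart, int tableEnd) {
        return startsInsideTable(rangeStart, tableStart, tableEnd)
                ? rangeStart - tableStart
                : 0;
    }

    public static void main(String[] args) {
        int tableStart = 2044;     // table start seen when iterating getRange()
        int tableEnd = 2353;       // table end seen when iterating getRange()
        int subrangeStart = 2045;  // start of getFirstFooterSubrange()

        System.out.println("starts inside table: "
                + startsInsideTable(subrangeStart, tableStart, tableEnd));
        System.out.println("clipped leading chars: "
                + clippedLeadingChars(subrangeStart, tableStart, tableEnd));
    }
}
```

If that assumption holds, the first paragraph seen through the subrange (2045 : 2217) is in the table but is not the table's first paragraph (2044 -> 2045), which would match the exception message.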


Is the file corrupt, or is this an area for improvement within POI?

Basic debugging code:
    // Needs: org.apache.poi.hwpf.HWPFDocument, org.apache.poi.hwpf.HWPFTestDataSamples,
    // and HeaderStories, Range, Paragraph, Table, TableRow from org.apache.poi.hwpf.usermodel.
    public void testTableInFirstPageFooter() throws Exception {
        HWPFDocument d = HWPFTestDataSamples.openSampleFile("TIKA-1733.doc");
        HeaderStories headerStories = new HeaderStories(d);
        // Uncomment the other entries to compare against the other ranges.
        for (Range r : new Range[]{
                // headerStories.getRange(),
                // headerStories.getFirstHeaderSubrange(),
                headerStories.getFirstFooterSubrange(),
                // headerStories.getOddFooterSubrange()
        }) {
            if (r != null) {
                System.out.println("RANGE: " + r.getStartOffset() + " : " + r.getEndOffset());
                for (int j = 0; j < r.numParagraphs(); j++) {
                    Paragraph p = r.getParagraph(j);
                    System.out.println("P OFFSETS:" +
                            p.getStartOffset() + " : " +
                            p.getEndOffset());
                    if (p.isInTable()) {
                        System.out.println("table text: >" + p.text().replaceAll("[\\r\\n]", " ") + "<");
                        // Throws IllegalArgumentException here for the first page footer subrange.
                        Table t = r.getTable(p);
                        for (int rNum = 0; rNum < t.numRows(); rNum++) {
                            TableRow row = t.getRow(rNum);
                            System.out.println("<ROW>" + row.text().replaceAll("[\\r\\n]", " ")
                                    + "</ROW>");
                        }
                        System.out.println("table Offsets: " + t.getStartOffset() + " -> " + t.getEndOffset());
                        // Skip the remaining paragraphs of this table.
                        j += t.numParagraphs() - 1;
                    }
                }
            }
        }
    }
Comment 1 Tim Allison 2016-01-05 14:32:41 UTC
Any help available on this one?  This area of the code base is new to me.  Thank you!