On TIKA-1733, Christophe Lacroix reported this exception and attached a triggering document. I can reproduce this bug with POI's trunk.

When I iterate through the paragraphs of the overall range (headerStories.getRange()), the first paragraph of the first-page footer appears at offsets 2044 to 2045. With table handling (is the paragraph in a table? if so, grab the table), the full table spans offsets 2044 through 2353.

P OFFSETS:2044 -> 2045
table text: > <
<ROW> Prosys - 23, rue du Capitaine Ferber - 92130 Issy-les-Moulineaux - Tel : +33 (0)1 41 23 27 77 - Fax : +33 (0)1 41 23 27 99 - www.prosys.fr - Une société du Groupe Moniteur</ROW>
<ROW>SAS au capital de 780 000 euros - SIRET : 344 894 985 000 52 RCS Nanterre - APE : 6202 A - TVA intracommunautaire : FR 32 344 894 985</ROW>
table Offsets: 2044 -> 2353

When I run the same iteration over headerStories.getFirstFooterSubrange(), the target range is 2045 -> 2355, and getTable() throws on the first paragraph.

RANGE: 2045 : 2355
P OFFSETS:2045 : 2217
table text: >Prosys - 23, rue du Capitaine Ferber - 92130 Issy-les-Moulineaux - Tel : +33 (0)1 41 23 27 77 - Fax : +33 (0)1 41 23 27 99 - www.prosys.fr - Une société du Groupe Moniteur<
java.lang.IllegalArgumentException: This paragraph is not the first one in the table
    at org.apache.poi.hwpf.usermodel.Range.getTable(Range.java:927)
    at org.apache.poi.hwpf.TestHWPFRangeParts.testTableInFirstPageFooter(TestHWPFRangeParts.java:214)

Is the file corrupt, or is this an area for improvement within POI?

Basic debugging code:

public void testTableInFirstPageFooter() throws Exception {
    HWPFDocument d = HWPFTestDataSamples.openSampleFile("TIKA-1733.doc");
    HeaderStories headerStories = new HeaderStories(d);
    // Range r = headerStories.getFirstFooterSubrange();
    int i = 0;
    // Uncomment entries to compare the overall range with the subranges.
    for (Range r : new Range[]{
            // headerStories.getRange(),
            //headerStories.getFirstHeaderSubrange(),
            headerStories.getFirstFooterSubrange(),
            //headerStories.getOddFooterSubrange()
    }) {
        if (r != null) {
            System.out.println("RANGE: " + r.getStartOffset() + " : " + r.getEndOffset());
            for (int j = 0; j < r.numParagraphs(); j++) {
                Paragraph p = r.getParagraph(j);
                System.out.println("P OFFSETS:" + p.getStartOffset() + " : " + p.getEndOffset());
                if (p.isInTable()) {
                    System.out.println("table text: >" + p.text().replaceAll("[\\r\\n]", " ") + "<");
                    // Throws IllegalArgumentException if p is not the table's first paragraph.
                    Table t = r.getTable(p);
                    for (int rNum = 0; rNum < t.numRows(); rNum++) {
                        TableRow row = t.getRow(rNum);
                        System.out.println("<ROW>" + row.text().replaceAll("[\\r\\n]", " ") + "</ROW>");
                    }
                    System.out.println("table Offsets: " + t.getStartOffset() + " -> " + t.getEndOffset());
                    // Skip the remaining paragraphs of this table.
                    j += t.numParagraphs() - 1;
                }
            }
        }
        i++;
    }
}
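To make the offset mismatch explicit, here is a small standalone check that uses only the numbers from the output above and does not touch POI. The class name FooterOffsetCheck and the helper startsInsideTable are made up for illustration. My reading, which is only an assumption, is that the first-footer subrange begins one character after the table starts, so the first paragraph it sees is inside the table but is not the table's first paragraph; that would be consistent with the exception.

```java
public class FooterOffsetCheck {

    // Hypothetical helper: true when a range begins strictly inside a table,
    // i.e. after the table's start offset but before its end offset.
    static boolean startsInsideTable(int rangeStart, int tableStart, int tableEnd) {
        return tableStart < rangeStart && rangeStart < tableEnd;
    }

    public static void main(String[] args) {
        // Offsets copied from the debug output in this report.
        int tableStart = 2044, tableEnd = 2353;       // seen via headerStories.getRange()
        int subrangeStart = 2045, subrangeEnd = 2355; // seen via getFirstFooterSubrange()

        System.out.println("subrange starts inside table: "
                + startsInsideTable(subrangeStart, tableStart, tableEnd)); // true
        System.out.println("start shift: " + (subrangeStart - tableStart)); // 1
        System.out.println("end shift: " + (subrangeEnd - tableEnd));       // 2
    }
}
```

If that reading is right, the subrange start is off by one relative to the table and the end is off by two, which might point at how the subrange boundaries are computed rather than at a corrupt file, but I have not confirmed that.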
Any help available on this one? This area of the code base is new to me. Thank you!