Bug 60337

Summary: XWPFTableRow.isRepeatHeader throws NullPointerException, setRepeatHeader does not overwrite old value
Product: POI Reporter: Simon Gaeremynck <gaeremyncks>
Component: XWPFAssignee: POI Developers List <dev>
Status: RESOLVED FIXED    
Severity: normal CC: gaeremyncks
Priority: P2    
Version: 3.16-dev   
Target Milestone: ---   
Hardware: Macintosh   
OS: All   
Attachments: A simple Word 2007 document that contains a table with a repeating header row
mylyn/context/zip

Description Simon Gaeremynck 2016-11-03 15:26:07 UTC
Created attachment 34418 [details]
A simple Word 2007 document that contains a table with a repeating header row

This bug report relates to the repeating header logic in XWPFTableRow.

When doing a .isRepeatHeader call on the provided Word document an NPE is thrown on line 229 (repeat = rpt.getVal().equals(STOnOff.ON);). It looks like .getVal() can be null, even though the repeating header is enabled in the document. I Generated this document in Microsoft Word for Mac v15.25 (160817).

Related to this, setRepeatHeader seems to append values rather than replace them. When you check whether the header is a repeating table header after a setRepeatHeader call you will always get the initial value as (CTOnOff rpt = trpr.getTblHeaderArray(0);) is used. I'm not sure whether that's according to the spec or not but I thought I'd mention it in the report nonetheless.

Below is a diff [1] that includes test cases for both bugs.

Let me know if you need any extra information.

Thanks!
Simon


diff --git a/src/ooxml/testcases/org/apache/poi/xwpf/usermodel/TestXWPFTableRow.java b/src/ooxml/testcases/org/apache/poi/xwpf/usermodel/TestXWPFTableRow.java
index b22f1a0..2f1242f 100644
--- a/src/ooxml/testcases/org/apache/poi/xwpf/usermodel/TestXWPFTableRow.java
+++ b/src/ooxml/testcases/org/apache/poi/xwpf/usermodel/TestXWPFTableRow.java
@@ -18,6 +18,8 @@
 package org.apache.poi.xwpf.usermodel;
 
 import junit.framework.TestCase;
+
+import org.apache.poi.xwpf.XWPFTestDataSamples;
 import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTRow;
 import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTTbl;
 
@@ -60,8 +62,29 @@ public class TestXWPFTableRow extends TestCase {
         XWPFTableRow tr = table.getRow(0);
         assertNotNull(tr);
 
+        // Assert the repeat header is false by default
+        boolean isRpt = tr.isRepeatHeader();
+        assertFalse(isRpt);
+
+        // Repeat the header
         tr.setRepeatHeader(true);
+        isRpt = tr.isRepeatHeader();
+        assertTrue(isRpt);
+
+        // Make the header no longer repeating
+        tr.setRepeatHeader(false);
+        isRpt = tr.isRepeatHeader();
+        assertFalse(isRpt);
+    }
+
+    // Test that validates the table header value can be parsed from a document generated in Word
+    public void testIsRepeatHeader() throws Exception {
+        XWPFDocument doc = XWPFTestDataSamples.openSampleDocument("Word2007TableWithHeader.docx");
+        XWPFTable table = doc.getTables().get(0);
+        assertNotNull(table);
+        XWPFTableRow tr = table.getRow(0);
+        assertNotNull(tr);
         boolean isRpt = tr.isRepeatHeader();
-        assert (isRpt);
+        assertTrue(isRpt);
     }
 }
Comment 1 Simon Gaeremynck 2016-11-03 15:38:43 UTC
I admit I'm fairly new to this but I think the specification [1] allows for just having a <w:tblHeader /> element (so no val="true" attribute).

The logic should probably be something like:
if (trpr.sizeOfTblHeaderArray() > 0) {
    // Use last index here?
    CTOnOff rpt = trpr.getTblHeaderArray(0);
    STOnOff.Enum val = rpt.getVal();
    repeat = (val == null || val.equals(STOnOff.ON));
}

[1] http://officeopenxml.com/WPtableRowProperties.php
Comment 2 Nick Burch 2016-11-03 17:38:15 UTC
Are you able to construct a file in word where trpr.sizeOfTblHeaderArray() is 2 or 3? That would let us verify the right behaviour in the fix, as well as allowing an even better unit test!
Comment 3 Mark Murphy 2016-11-04 22:29:26 UTC
I don't think that is a possibility for Word, and it doesn't make any sense. I don't know why the Schema is that way, there are several properties that can be arrays in this context, but in a document, it makes no sense.

Can't split row is another property that wiull have the same problems as repeating rows. I think I have this.
Comment 4 Mark Murphy 2016-11-05 06:14:08 UTC
r1768153
Comment 5 Mark Murphy 2016-11-05 06:14:11 UTC
Created attachment 34421 [details]
mylyn/context/zip