Created attachment 31177
Word document with 3 content controls

When calling getText(), the contents of a content control are not returned when the content control is within a paragraph that also contains other text. When the content control is the only item in the paragraph, its text is returned.

This appears to be the exact opposite of the behaviour in 3.9, where text in a content control that is the only item in a paragraph doesn't appear, though text in a content control within a paragraph with other text does. (That fix appears to have been in the onDocumentRead() method of org.apache.poi.xwpf.usermodel.XWPFDocument.)

I've used the following test (and the attached document) to demonstrate the problem.

    public void test_manualDoc() throws FileNotFoundException, IOException {
        String filepath = "resources/contentcontrol.docx";
        // Text expected from all three content controls plus the surrounding paragraph text
        String expected = "Content control within a paragraph is here text content from within a paragraph second control with a new\nline\n\nContent control that is the entire paragraph";

        XWPFDocument doc = new XWPFDocument(new FileInputStream(filepath));
        XWPFWordExtractor extractedDoc = new XWPFWordExtractor(doc);

        String actual = extractedDoc.getText();

        extractedDoc.close();
        Assert.assertEquals(expected, actual);
    }
Fixed in r1875802 by including runs of type XWPFSDT during text extraction.
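
For readers unfamiliar with the failure mode, here is a minimal sketch of the idea behind the fix, written as a simplified stand-alone model and not as POI's actual code. The names RunElement, Run, Sdt and extract are hypothetical stand-ins invented for illustration (Sdt plays the role of an inline content control such as XWPFSDT), and the includeSdt flag only exists so both behaviours can be compared side by side.

```java
import java.util.Arrays;
import java.util.List;

public class SdtTextSketch {
    /** Hypothetical stand-in for anything that can sit inline in a paragraph. */
    interface RunElement {
        String text();
    }

    /** A plain text run. */
    static final class Run implements RunElement {
        private final String text;
        Run(String text) { this.text = text; }
        public String text() { return text; }
    }

    /** An inline content control that wraps its own runs (assumed model, not POI's class). */
    static final class Sdt implements RunElement {
        private final List<Run> content;
        Sdt(Run... content) { this.content = Arrays.asList(content); }
        public String text() {
            StringBuilder sb = new StringBuilder();
            for (Run r : content) {
                sb.append(r.text());
            }
            return sb.toString();
        }
    }

    /**
     * Concatenates the text of a paragraph's inline elements.
     * With includeSdt == false only plain runs are visited, which models the reported bug;
     * with includeSdt == true content controls are visited too, which models the fix.
     */
    static String extract(List<RunElement> paragraph, boolean includeSdt) {
        StringBuilder sb = new StringBuilder();
        for (RunElement e : paragraph) {
            if (e instanceof Run || (includeSdt && e instanceof Sdt)) {
                sb.append(e.text());
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<RunElement> paragraph = Arrays.asList(
                new Run("Content control within a paragraph is here "),
                new Sdt(new Run("text content")),
                new Run(" from within a paragraph"));
        System.out.println(extract(paragraph, false)); // content control text is dropped
        System.out.println(extract(paragraph, true));  // content control text is included
    }
}
```

In this model, the first call loses "text content" because the loop only accepts plain runs, which mirrors the symptom in the report; the second call keeps it because the content control is treated as one more inline element.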