Bug 50972

Summary: XWPFWordExtractor ignores <w:br/> entries
Product: POI
Reporter: Igor Rogov <igor.rogov.35>
Component: XWPF
Assignee: POI Developers List <dev>
Status: RESOLVED FIXED
Severity: normal
CC: igor.rogov.35
Priority: P2    
Version: 3.8-dev   
Target Milestone: ---   
Hardware: PC   
OS: All   
Attachments: Test document

Description Igor Rogov 2011-03-25 12:40:58 UTC
Created attachment 26797: Test document

Two words separated by a line break character are glued together.

I debugged the issue and found this code in the XWPFRun.toString() method:

if (o instanceof CTEmpty) {
   // Some inline text elements get returned not as
   //  themselves, but as CTEmpty, owing to some odd
   //  definitions around line 5642 of the XSDs
   String tagName = o.getDomNode().getNodeName();
   if ("w:tab".equals(tagName)) {
      text.append("\t");
   }
   if ("w:br".equals(tagName)) {
      text.append("\n");
   }
   <...>
}

The issue is that "o" is an instance of CTBrImpl, not CTEmpty, so the instanceof check fails and the <w:br/> element is ignored.
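
To illustrate the point, here is a minimal, self-contained sketch that picks the separator from the element's tag name alone, so it does not matter which Java class the schema binding happens to return for <w:br/>. It uses only the JDK's DOM parser rather than the POI/XMLBeans classes, and the class RunTextSketch and the method runText are hypothetical names made up for this example. It is not the change that was committed to POI.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class RunTextSketch {

    // Hypothetical helper: collects the text of a single <w:r> element,
    // deciding what to append purely from each child's tag name.
    static String runText(String runXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(runXml)));

        StringBuilder text = new StringBuilder();
        NodeList children = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            String tagName = child.getNodeName();
            if ("w:t".equals(tagName)) {
                text.append(child.getTextContent());
            } else if ("w:tab".equals(tagName)) {
                text.append('\t');
            } else if ("w:br".equals(tagName)) {
                // Handled by name, regardless of the bound Java type
                text.append('\n');
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml =
            "<w:r xmlns:w=\"http://schemas.openxmlformats.org/wordprocessingml/2006/main\">"
            + "<w:t>first</w:t><w:br/><w:t>second</w:t></w:r>";
        // The two words should be separated by a newline, not glued together
        System.out.println(runText(xml).equals("first\nsecond"));
    }
}
```

In the XWPFRun code quoted above, the equivalent idea would be to reach the tag-name comparison for CTBr objects as well as CTEmpty ones, but how the actual fix was structured is not shown in this report.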

Attached a test document.
Comment 1 Nick Burch 2011-03-25 13:00:38 UTC
Ah, looks like someone fixed the code for one set of ooxml-schemas, but not the other.

Fixed in r1085471.