Bug 56350

Summary: XWPFRun separating text in the same format
Product: POI
Reporter: Navaneeth Anantharaman <navaneethster>
Component: XWPF
Assignee: POI Developers List <dev>
Status: RESOLVED INVALID    
Severity: major
CC: navaneethster
Priority: P2    
Version: unspecified   
Target Milestone: ---   
Hardware: PC   
OS: All   
Attachments: the template word Document used in program.

Description Navaneeth Anantharaman 2014-04-04 20:01:47 UTC
Created attachment 31477
The template Word document used in the program.

I am trying to execute the code below:

var doc = new XWPFDocument(OPCPackage.open(("D:\\template.docx")));

for ( p in doc.getParagraphs()) {
  for ( r in p.getRuns()) {
    var text = r.toString();
    print(text)
    if (text.contains("$ClaimNumber")) {
      text = text.replace("$ClaimNumber", claimRes.ClaimNumber);
      r.setText(text,0);
    }
    if (text.contains("$NoOfExposures")) {
      text = text.replace("$NoOfExposures", claimRes.Exposures.length);
      r.setText(text,0);
    }
  }
}

try {
  outStream = new FileOutputStream("D:\\templateoutput.docx");
} catch (e : FileNotFoundException ) {
  e.printStackTrace();
}

try {
  doc.write(outStream);
  print("File updated successfully")
  outStream.close();
} catch (e : FileNotFoundException ) {
  e.printStackTrace();
} catch (e :IOException ) {
  e.printStackTrace();
}

The output is:

The Claim Number is $
ClaimNumber
There are $
NoOfExposures
 Exposures
File updated successfully

The output shows that XWPFRun is not splitting the text the way I expected. I am trying to replace $ClaimNumber with the actual claim number.
However, the $ and the word ClaimNumber are separated into two runs, so neither run contains the full placeholder and nothing gets replaced in the output.
The document itself is simple and straightforward, and as far as I can see the entire paragraph uses the same formatting.
That made me assume the paragraph would also be a single run, but the output shows it is not. Please let me know how to resolve this.

PS:
The code above is Gosu.
However, it is easily converted to Java, and the usages of the variable claimRes can be replaced with any text.

Comment 1 Nick Burch 2014-04-05 16:34:11 UTC
That's not a bug in Apache POI; it's just a quirk of how Word writes the file. For some reason, Word has decided that the $ and the rest of your text have, had, or may have had different formatting, so it split them into two runs.

You'll need to tweak your logic to detect and cope with the case where the search text spans several runs, then decide which run to put the replacement in and which to delete.
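
A minimal sketch of that approach, in plain Java so it runs without POI on the classpath. It models each run as a String in a list; the class name RunPlaceholderReplacer and the method replaceAcrossRuns are hypothetical and are not part of POI. The idea is to join the run texts, find the placeholder in the joined text, write the replacement into the first run the match touches, and strip the matched characters from every other run it touches.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not a POI class: each String stands in for the text of one run.
class RunPlaceholderReplacer {

    // Returns a copy of 'runs' in which every occurrence of 'placeholder',
    // even one split across several runs, is replaced by 'replacement'.
    static List<String> replaceAcrossRuns(List<String> runs, String placeholder, String replacement) {
        List<String> result = new ArrayList<>(runs);
        if (placeholder.isEmpty()) {
            return result;
        }
        int searchFrom = 0;
        while (true) {
            // Search the paragraph text as a whole, ignoring run boundaries.
            String joined = String.join("", result);
            int start = joined.indexOf(placeholder, searchFrom);
            if (start < 0) {
                return result;
            }
            int end = start + placeholder.length(); // exclusive
            int offset = 0;
            boolean replacementWritten = false;
            for (int i = 0; i < result.size(); i++) {
                String text = result.get(i);
                int runStart = offset;
                int runEnd = offset + text.length();
                offset = runEnd;
                if (runEnd <= start || runStart >= end) {
                    continue; // this run does not overlap the match
                }
                // Keep whatever lies outside the match in this run.
                String before = runStart < start ? text.substring(0, start - runStart) : "";
                String after = runEnd > end ? text.substring(end - runStart) : "";
                // Only the first overlapping run receives the replacement text.
                String middle = replacementWritten ? "" : replacement;
                replacementWritten = true;
                result.set(i, before + middle + after);
            }
            // Continue after the inserted text so the replacement is never re-scanned.
            searchFrom = start + replacement.length();
        }
    }

    public static void main(String[] args) {
        List<String> runs = Arrays.asList("There are $", "NoOfExposures", " Exposures");
        System.out.println(replaceAcrossRuns(runs, "$NoOfExposures", "3"));
    }
}
```

With the runs from the report, "The Claim Number is $" and "ClaimNumber" become "The Claim Number is 12345" and an empty string when the replacement is "12345". To apply this to a real document, one would presumably read each run with XWPFRun.getText(0), write the results back with XWPFRun.setText(text, 0), and either leave the emptied runs in place or drop them with XWPFParagraph.removeRun(int). Putting the replacement in the first run means it takes that run's formatting, which is a design choice rather than a requirement.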