Bug 49446 - [patch] please don't insert field codes in the XWPFWordExtractor output
Status: RESOLVED FIXED
Alias: None
Product: POI
Classification: Unclassified
Component: XWPF
Version: 3.6-dev
Hardware: PC Windows XP
Importance: P2 normal
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-06-16 08:56 UTC by Antoni Mylka
Modified: 2010-06-29 09:39 UTC

Attachments
a patch (2.32 KB, patch)
2010-06-16 08:56 UTC, Antoni Mylka
A test case, to be placed in test-data/document (16.63 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2010-06-16 08:57 UTC, Antoni Mylka

Description Antoni Mylka 2010-06-16 08:56:02 UTC
Created attachment 25597
a patch

Section 17.16.23 of the OpenXML specification defines the w:instrText element. It contains field codes, which are usually of no interest to consumers of the full text. In the XMLBeans model, these elements show up as instances of CTText, just like ordinary w:t text.

I suggest that the XWPFParagraph.readNewText method should take this into account.

A patch is attached.
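
For illustration only (this is not the attached patch, and it does not use the POI or XMLBeans classes): a minimal sketch of the filtering idea, using the JDK's DOM parser on a raw paragraph fragment. The class name FieldCodeFilter and the method extractText are hypothetical names chosen for this example. It collects the content of w:t elements and skips w:instrText elements, which is roughly what XWPFParagraph.readNewText would need to do.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of POI: extracts visible text from a
// WordprocessingML paragraph while ignoring field codes (w:instrText).
final class FieldCodeFilter {
    static final String W_NS =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    static String extractText(String paragraphXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed so getLocalName() works
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(paragraphXml)));
            StringBuilder text = new StringBuilder();
            collect(doc.getDocumentElement(), text);
            return text.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse paragraph XML", e);
        }
    }

    // Depth-first walk: append w:t content, skip w:instrText entirely.
    private static void collect(Node node, StringBuilder text) {
        if (node.getNodeType() == Node.ELEMENT_NODE
                && W_NS.equals(node.getNamespaceURI())) {
            String name = node.getLocalName();
            if ("instrText".equals(name)) {
                return; // field code such as " AUTHOR ": not part of the full text
            }
            if ("t".equals(name)) {
                text.append(node.getTextContent());
                return;
            }
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            collect(children.item(i), text);
        }
    }
}
```

Given a paragraph that contains an AUTHOR field, the field instruction is dropped while the cached field result (an ordinary w:t run) is kept.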
Comment 1 Antoni Mylka 2010-06-16 08:57:09 UTC
Created attachment 25598
A test case, to be placed in test-data/document

Added a sample .docx file containing AUTHOR and CREATEDATE fields.
Comment 2 Antoni Mylka 2010-06-24 04:57:34 UTC
Comment on attachment 25597
a patch

Enabled the "patch" checkbox on the poi-fieldcodes.patch attachment.
Comment 3 Nick Burch 2010-06-29 09:39:21 UTC
Thanks for the patch, applied in r958965.