Bug 63575 - XWPFWordExtractor - capitalized text (<w:caps/>)
Status: NEW
Alias: None
Product: POI
Classification: Unclassified
Component: XWPF
Version: unspecified
Hardware: PC Linux
Importance: P2 normal
Target Milestone: ---
Assignee: POI Developers List
Depends on:
Reported: 2019-07-21 09:58 UTC by Franz Seidl
Modified: 2019-07-21 10:08 UTC

Example (9.83 KB, application/zip)
2019-07-21 09:58 UTC, Franz Seidl

Description Franz Seidl 2019-07-21 09:58:32 UTC
Created attachment 36670

XWPFWordExtractor ignores the all-caps run property (<w:caps/>): text formatted as all caps is extracted in its stored case instead of in upper case.

See attached example:
  - WordTextExtractorDocx.java: test program
  - capitalized.docx: test file
  - capitalized.txt: "text only" version saved with Word

I expect the text: "The following word is: CAPITALIZED."
Instead I get: "The following word is: capitalized."
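
The expected behavior can be sketched without POI, as a rough illustration only. Here `Run` is a hypothetical stand-in for a Word run (it is not a POI class), and `extract` is an assumed helper name. In POI itself the corresponding flag should be readable through `XWPFRun.isCapitalized()`, but this sketch does not claim to show how the extractor is or should be implemented.

```java
import java.util.List;
import java.util.Locale;

public class CapsAwareExtraction {

    /** Hypothetical stand-in for a Word run: stored text plus the <w:caps/> flag. */
    static final class Run {
        final String text;
        final boolean caps;

        Run(String text, boolean caps) {
            this.text = text;
            this.caps = caps;
        }
    }

    /** Concatenates run text, upper-casing runs that carry the caps flag. */
    static String extract(List<Run> runs) {
        StringBuilder sb = new StringBuilder();
        for (Run run : runs) {
            // Locale.ROOT avoids locale-specific surprises such as the Turkish dotless i.
            sb.append(run.caps ? run.text.toUpperCase(Locale.ROOT) : run.text);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Mirrors the test file: only the word "capitalized" is formatted with <w:caps/>.
        List<Run> runs = List.of(
                new Run("The following word is: ", false),
                new Run("capitalized", true),
                new Run(".", false));
        System.out.println(extract(runs)); // The following word is: CAPITALIZED.
    }
}
```

This matches what Word writes to capitalized.txt: the stored text stays lower case in the document, and the upper-casing is applied only on output.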
Comment 1 Franz Seidl 2019-07-21 10:04:33 UTC
Similar to Bug 63576.