Bug 63576

Summary: WordExtractor - capitalized text
Product: POI
Reporter: Franz Seidl <website>
Component: HWPF
Assignee: POI Developers List <dev>
Status: NEW
Severity: normal
CC: website
Priority: P2
Version: unspecified
Target Milestone: ---
Hardware: PC
OS: Linux
Attachments: Example

Description Franz Seidl 2019-07-21 10:02:04 UTC
Created attachment 36671

WordExtractor does not honor text that is formatted as capitalized: it returns the underlying characters instead of their upper-case form.

See attached example:
  - WordTextExtractorDoc.java: test program
  - capitalized.doc: test file
  - capitalized.txt: "text only" version saved with Word

I expect the text: "The following word is: CAPITALIZED."
Instead I get: "The following word is: capitalized."
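
For illustration only, here is a minimal sketch of the behavior expected above. It does not use POI: `TextRun`, its `capitalized` flag and `extract` are hypothetical names that only model a document as a sequence of runs, each carrying a "capitalized" formatting flag. The assumption is that a plain-text extractor should upper-case the runs that carry this flag, as the "text only" export from Word does.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical model, NOT the POI API: runs of text with a "capitalized" flag.
final class CapitalizedTextSketch {

    /** One run of text sharing the same character formatting (assumed model). */
    static final class TextRun {
        final String text;
        final boolean capitalized;

        TextRun(String text, boolean capitalized) {
            this.text = text;
            this.capitalized = capitalized;
        }
    }

    /** Concatenates the runs, upper-casing those formatted as capitalized. */
    static String extract(List<TextRun> runs) {
        StringBuilder out = new StringBuilder();
        for (TextRun run : runs) {
            // The stored characters are lower case; the flag changes how they should be rendered.
            out.append(run.capitalized ? run.text.toUpperCase(Locale.ROOT) : run.text);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<TextRun> runs = Arrays.asList(
                new TextRun("The following word is: ", false),
                new TextRun("capitalized", true),
                new TextRun(".", false));
        System.out.println(extract(runs)); // prints "The following word is: CAPITALIZED."
    }
}
```

Without the flag check, the same input yields "The following word is: capitalized.", which matches the output reported here.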

Comment 1 Franz Seidl 2019-07-21 10:04:08 UTC
Similar to bug 63575