Bug 60942 - Avoid unicode check in Word 6.0 docs
Status: RESOLVED DUPLICATE of bug 50955
Alias: None
Product: POI
Classification: Unclassified
Component: HWPF
Version: 3.16-dev
Hardware: PC All
Importance: P2 enhancement
Target Milestone: ---
Assignee: POI Developers List
Depends on:
Reported: 2017-03-30 16:11 UTC by Tim Allison
Modified: 2017-03-31 18:37 UTC
Description Tim Allison 2017-03-30 16:11:06 UTC
This is a half step toward bug 60936.

"On TIKA-2313, Steven Hall submitted an example Word 6.0 file whose extracted text is garbage."

From what I can tell, Word 6.0 didn't use Unicode. Until we can figure out how the codepage was specified in Word 6.0 files, we should at least turn off the Unicode check for them.
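
A minimal sketch of the idea, not HWPF's actual code: when the file is known to be Word 6.0, ignore any per-piece "Unicode" flag and decode the text bytes as single-byte characters. The class name, the method signature, and the windows-1252 fallback are all assumptions made for illustration; how Word 6.0 really specifies its codepage is still the open question above.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; this is not part of the POI API.
class Word6TextDecoder {

    // Assumption: fall back to windows-1252 until the real codepage source is known.
    static final Charset ASSUMED_WORD6_CHARSET = Charset.forName("windows-1252");

    /**
     * Decodes the raw bytes of a text piece.
     *
     * @param raw         bytes of the text piece as stored in the file
     * @param isWord6     true if the document was identified as Word 6.0
     * @param unicodeFlag the "is Unicode" flag read for this piece
     */
    static String decode(byte[] raw, boolean isWord6, boolean unicodeFlag) {
        // For Word 6.0 the Unicode flag is not trusted, so the check is skipped.
        if (isWord6 || !unicodeFlag) {
            return new String(raw, ASSUMED_WORD6_CHARSET);
        }
        // Later formats store Unicode pieces as little-endian UTF-16.
        return new String(raw, StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        byte[] raw = "Hello!".getBytes(ASSUMED_WORD6_CHARSET);
        // With the check skipped, the single-byte text survives intact.
        System.out.println(decode(raw, true, true));
        // Trusting the flag pairs up the bytes as UTF-16 and yields garbage.
        System.out.println(decode(raw, false, true));
    }
}
```

The second call in `main` shows the failure mode described in the report: single-byte text read as UTF-16 comes out as unrelated characters.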
Comment 1 Tim Allison 2017-03-31 18:37:44 UTC

*** This bug has been marked as a duplicate of bug 50955 ***