This is a half step towards bug 60936. "On TIKA-2313, Steven Hall submitted an example Word 6.0 file whose extracted text is garbage." From what I can tell, Word 6.0 didn't use Unicode. Until we can figure out how the codepage was specified in 6.0, we should at least turn off the Unicode check for Word 6.0 files.
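To illustrate the idea, here is a rough sketch of what skipping the Unicode check could look like. It is not the actual parser code: the class name, the decode() helper, and both boolean parameters are hypothetical, and windows-1252 is only an assumed fallback, since how 6.0 specified its codepage is still unknown.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not the real parser code.
class Word6TextDecodeSketch {
    // Assumption: windows-1252 as the fallback codepage. The codepage a
    // Word 6.0 file really uses has not been determined yet.
    static final Charset ASSUMED_CODEPAGE = Charset.forName("windows-1252");

    // Decode a run of text bytes. For Word 6.0 files the Unicode flag is
    // ignored, so the bytes are always treated as single-byte text.
    static String decode(byte[] raw, boolean isWord6, boolean unicodeFlag) {
        boolean useUnicode = !isWord6 && unicodeFlag;
        Charset charset = useUnicode ? StandardCharsets.UTF_16LE : ASSUMED_CODEPAGE;
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        byte[] raw = "Word 6".getBytes(ASSUMED_CODEPAGE);
        // With the check turned off for Word 6.0, the text survives.
        System.out.println(decode(raw, true, true));
        // Trusting the flag pairs up the bytes as UTF-16LE code units,
        // which yields unrelated characters.
        System.out.println(decode(raw, false, true));
    }
}
```

The second call shows the kind of garbage that results when single-byte text is read as UTF-16LE: every two bytes collapse into one unrelated character.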
*** This bug has been marked as a duplicate of bug 50955 ***