Bug 50616

Summary: WordDocument.writeAllText returns incomplete result without throwing exception
Product: POI
Reporter: Peter Drozda <peter.drozda>
Component: HDF
Assignee: POI Developers List <dev>
Severity: normal    
Priority: P2    
Version: 3.7-FINAL   
Target Milestone: ---   
Hardware: PC   
OS: Windows XP   
Attachments: MS Word document on which the effect can be reproduced.

Description Peter Drozda 2011-01-19 06:26:27 UTC
Created attachment 26514
MS Word document on which the effect can be reproduced.

When an MS Word document containing Greek characters (please see the attachment) is passed to org.apache.poi.hdf.extractor.WordDocument, the writeAllText method returns an incomplete result. No exception is thrown to indicate the problem.

Steps to reproduce:

1. Use the MS Word document from the attachment.
2. Create an input stream for the document and then use this snippet:

            WordDocument wd = new WordDocument(inputStream);
            StringWriter docTextWriter = new StringWriter();
            PrintWriter pw = new PrintWriter(docTextWriter);
            wd.writeAllText(pw);  // write the extracted text into pw
            pw.flush();
            result = docTextWriter.toString();

3. Expected result: a string containing "Process description document τεστ new".
4. Actual result: "Process description".
5. There is no sign of an internal error, and no exception is thrown.

I would expect at least an exception to be thrown as an indicator that something went wrong.
Comment 1 Nick Burch 2011-01-19 06:43:10 UTC
HDF is no longer supported and remains only for existing legacy users. Please try with HWPF.
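
Whichever extractor is used, a caller can guard against silently truncated output by checking the extracted text for fragments it expects. The following is only a minimal sketch using the JDK alone. The class name `ExtractionCheck` and the method `requireContains` are hypothetical and not part of POI, and the expected fragments are taken from the report above.

```java
class ExtractionCheck {
    /**
     * Returns the text unchanged if every expected fragment is present;
     * otherwise throws, so truncated extraction cannot go unnoticed.
     * (Hypothetical helper, not a POI API.)
     */
    static String requireContains(String text, String... expected) {
        for (String fragment : expected) {
            if (!text.contains(fragment)) {
                throw new IllegalStateException(
                        "Extracted text is missing \"" + fragment + "\"");
            }
        }
        return text;
    }

    public static void main(String[] args) {
        // The truncated result described in this report.
        String actual = "Process description";
        // Unicode escapes for the Greek word from the expected result,
        // used so the source does not depend on the file encoding.
        String greek = "\u03c4\u03b5\u03c3\u03c4";
        try {
            requireContains(actual, "Process description", greek);
            System.out.println("extraction looks complete");
        } catch (IllegalStateException e) {
            System.out.println("extraction incomplete: " + e.getMessage());
        }
    }
}
```

A check like this only helps when the caller already knows part of the expected content, so it suits regression tests better than general-purpose extraction.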