| Summary: | Word document with a single table gets corrupted after load/save with no changes | | |
|---|---|---|---|
| Product: | POI | Reporter: | Kostiantyn Miklevskyi <kostiantyn.miklevskyi> |
| Component: | HWPF | Assignee: | POI Developers List <dev> |
| Status: | RESOLVED DUPLICATE | | |
| Severity: | major | | |
| Priority: | P2 | | |
| Version: | 3.15-FINAL | | |
| Target Milestone: | --- | | |
| Hardware: | PC | | |
| OS: | All | | |
| Attachments: | Maven project with document corruption example; output .doc file after running unit test; Screenshot of Word error message when opening a corrupted file; LibreOffice 5.2.2.2 original file and corrupted file side-to-side | | |
Can POI read the document after load/save?
Created attachment 34340 [details]
output .doc file after running unit test
Using the DocumentWithOneTable.doc from your attachment, the unit test below creates the attached file. LibreOffice does not complain about this file. Can you check if Word reports that the attached file is corrupted?
Added to TestHPSFBugs.java:
public void test60217() throws Exception {
    InputStream fis = new FileInputStream("/tmp/bug60217.doc");
    POIDocument doc = new HWPFDocument(fis);
    fis.close();
    doc.write(new File("/tmp/bug60217-out.doc"));
    doc.close();
}
>Mark Murphy 2016-10-07 19:23:58 UTC
>Can POI read the document after load/save?
No, it throws an exception.
I should have included this in the initial report, since I had actually tried it.
Here is the code:
final POIDocument doc = new HWPFDocument(SaveToAnotherDocumentBug.class.getClassLoader().getResourceAsStream(DOCUMENT_NAME));
final File copy = new File(CORRUPTED_PREFIX + "-" + DOCUMENT_NAME);
doc.write(copy);
doc.close();
new HWPFDocument(new FileInputStream(copy));
It throws an exception with this stack trace:
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: -1845343745
at org.apache.poi.util.LittleEndian.getUByte(LittleEndian.java:274)
at org.apache.poi.hwpf.model.FormattedDiskPage.<init>(FormattedDiskPage.java:61)
at org.apache.poi.hwpf.model.PAPFormattedDiskPage.<init>(PAPFormattedDiskPage.java:85)
at org.apache.poi.hwpf.model.PAPBinTable.<init>(PAPBinTable.java:75)
at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:226)
at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:157)
at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:145)
at com.cosi.SaveToAnotherDocumentBug.main(SaveToAnotherDocumentBug.java:20)
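One detail of the index in this trace may help narrow things down: -1845343745 modulo 512 is 511. That would be consistent with the reader asking for the last byte of a 512-byte page whose page number is garbage in the re-saved file, so that the multiplication wraps around in 32-bit arithmetic. The Java sketch below only checks that arithmetic. The 512-byte page size, the helper names, and the sample page number are assumptions made for illustration; none of them is taken from POI's source or from the attached file.

```java
public class NegativeIndexSketch {
    // Assumption: pages are 512 bytes long. The bug report does not state this.
    static final int PAGE_SIZE = 512;

    /** Position of an index inside its 512-byte page; always in 0..511. */
    static int byteWithinPage(int index) {
        return Math.floorMod(index, PAGE_SIZE);
    }

    /** Offset of a page's last byte, computed in wrapping 32-bit int arithmetic. */
    static int lastByteOffset(int pageNumber) {
        return pageNumber * PAGE_SIZE + (PAGE_SIZE - 1);
    }

    /** Hypothetical guard: read an unsigned byte, rejecting offsets outside the buffer. */
    static int readUByteChecked(byte[] data, int offset) {
        if (offset < 0 || offset >= data.length) {
            throw new IllegalStateException(
                "offset " + offset + " is outside a buffer of " + data.length + " bytes");
        }
        return data[offset] & 0xFF;
    }

    public static void main(String[] args) {
        int reported = -1845343745; // index from the stack trace above

        // The reported index sits on the last byte of a 512-byte page.
        System.out.println(byteWithinPage(reported)); // prints 511

        // Illustrative page number only: one of several values that wrap to the
        // same int. It is NOT read from the corrupted file.
        int bogusPage = 4784420;
        System.out.println(lastByteOffset(bogusPage)); // prints -1845343745

        // With a bounds check, the failure names the bad offset up front.
        try {
            readUByteChecked(new byte[PAGE_SIZE], reported);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If this reading is right, the page numbers stored in the saved file would be the first thing to inspect, rather than the page contents themselves.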
>Javen O'Neal 2016-10-08 22:16:16 UTC
>Can you check if Word reports that the attached file is corrupted?
Yes. The same error message that Word reported previously.
Attaching a screenshot.
Created attachment 34350 [details]
Screenshot of Word error message when opening a corrupted file
Created attachment 34351 [details]
LibreOffice 5.2.2.2 original file and corrupted file side-to-side
I downloaded the latest stable LibreOffice version, 5.2.2.2, and it indeed doesn't complain about the corruption. I opened the original document and the corrupted one side by side to show the difference.
This looks quite similar to bug #60097, so I am closing this one as a duplicate to have one place to continue the discussion.
*** This bug has been marked as a duplicate of bug 60097 ***
Created attachment 34333 [details]
Maven project with document corruption example
Attaching a sample with a Word document that gets corrupted when we open it and save it to another file with code like:
final POIDocument doc = new HWPFDocument(new FileInputStream(DOCUMENT_NAME));
final File copy = new File(CORRUPTED_PREFIX + "-" + DOCUMENT_NAME);
doc.write(copy);
The source document opens fine. After the load/save round trip, Microsoft Word reports that the document is corrupted and cannot be recovered.