Bug 60217 - Word document with a single table gets corrupted after load/save with no changes
Status: RESOLVED DUPLICATE of bug 60097
Alias: None
Product: POI
Classification: Unclassified
Component: HWPF
Version: 3.15-FINAL
Hardware: PC All
Importance: P2 major
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-10-07 14:11 UTC by Kostiantyn Miklevskyi
Modified: 2019-08-29 18:04 UTC

Attachments
Maven project with document corruption example (9.42 KB, application/x-zip-compressed)
2016-10-07 14:11 UTC, Kostiantyn Miklevskyi
output .doc file after running unit test (22.00 KB, application/msword)
2016-10-08 22:16 UTC, Javen O'Neal
Screenshot of Word error message when opening a corrupted file (6.66 KB, image/png)
2016-10-10 06:42 UTC, Kostiantyn Miklevskyi
LibreOffice 5.2.2.2 original file and corrupted file side-to-side (55.40 KB, image/png)
2016-10-10 07:05 UTC, Kostiantyn Miklevskyi

Description Kostiantyn Miklevskyi 2016-10-07 14:11:55 UTC
Created attachment 34333 [details]
Maven project with document corruption example

I am attaching a sample Word document that gets corrupted when it is opened and saved to another file with code like this:

final POIDocument doc = new HWPFDocument(new FileInputStream(DOCUMENT_NAME));
final File copy = new File(CORRUPTED_PREFIX + "-" + DOCUMENT_NAME);
doc.write(copy);

The source document opens without problems.
After the load/save round trip, Microsoft Word reports that the resulting document is corrupted and cannot be recovered.
Comment 1 Mark Murphy 2016-10-07 19:23:58 UTC
Can POI read the document after load/save?
Comment 2 Javen O'Neal 2016-10-08 22:16:16 UTC
Created attachment 34340 [details]
output .doc file after running unit test

Using the DocumentWithOneTable.doc from your attachment, the unit test below creates the attached file. LibreOffice does not complain about this file. Can you check if Word reports that the attached file is corrupted?

Added to TestHPSFBugs.java:
public void test60217() throws Exception {
    InputStream fis = new FileInputStream("/tmp/bug60217.doc");
    POIDocument doc = new HWPFDocument(fis);
    fis.close();
    doc.write(new File("/tmp/bug60217-out.doc"));
    doc.close();
}
Comment 3 Kostiantyn Miklevskyi 2016-10-10 06:36:28 UTC
>Mark Murphy 2016-10-07 19:23:58 UTC
>Can POI read the document after load/save?

No, it throws an exception.
I should have included this in the initial report, since I had already tried it.

Here is the code:

        final POIDocument doc = new HWPFDocument(SaveToAnotherDocumentBug.class.getClassLoader().getResourceAsStream(DOCUMENT_NAME));
        final File copy = new File(CORRUPTED_PREFIX + "-" + DOCUMENT_NAME);
        doc.write(copy);
        doc.close();

        new HWPFDocument(new FileInputStream(copy));

It fails with this stack trace:

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: -1845343745
	at org.apache.poi.util.LittleEndian.getUByte(LittleEndian.java:274)
	at org.apache.poi.hwpf.model.FormattedDiskPage.<init>(FormattedDiskPage.java:61)
	at org.apache.poi.hwpf.model.PAPFormattedDiskPage.<init>(PAPFormattedDiskPage.java:85)
	at org.apache.poi.hwpf.model.PAPBinTable.<init>(PAPBinTable.java:75)
	at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:226)
	at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:157)
	at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:145)
	at com.cosi.SaveToAnotherDocumentBug.main(SaveToAnotherDocumentBug.java:20)
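
The negative index in the trace is consistent with a 32-bit integer overflow. As a sketch only, and assuming (nothing in this report confirms it) that the reader computes the index as pageNumber * 512 + 511 to reach the last byte of a 512-byte page, a garbage page number of 4784420 (0x490124) would wrap around to exactly -1845343745. The class and method names below are hypothetical and use only the JDK, not POI.

```java
// Hypothetical sketch: shows how an int overflow can yield the index seen in the trace.
// The formula pageNum * 512 + 511 is an assumption, not taken from this report.
class NegativeIndexSketch {
    static final int PAGE_SIZE = 512;

    // Index of the last byte of the given page; overflows silently for large page numbers.
    static int lastByteIndex(int pageNum) {
        return PAGE_SIZE * pageNum + (PAGE_SIZE - 1);
    }

    // Reads one unsigned byte; a negative index throws ArrayIndexOutOfBoundsException.
    static int getUByte(byte[] data, int index) {
        return data[index] & 0xFF;
    }

    public static void main(String[] args) {
        int pageNum = 4784420; // 0x490124, an implausible page number for a 22 KB file
        int index = lastByteIndex(pageNum);
        System.out.println(index); // prints -1845343745

        byte[] stream = new byte[22 * 1024]; // stand-in for the document stream
        try {
            getUByte(stream, index);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

If this reading holds, it would point at a bad page number stored in the saved file, though the sketch alone cannot confirm that.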
Comment 4 Kostiantyn Miklevskyi 2016-10-10 06:41:24 UTC
>Javen O'Neal 2016-10-08 22:16:16 UTC
>Can you check if Word reports that the attached file is corrupted?

Yes, Word shows the same error message as before.
I am attaching a screenshot.
Comment 5 Kostiantyn Miklevskyi 2016-10-10 06:42:33 UTC
Created attachment 34350 [details]
Screenshot of Word error message when opening a corrupted file
Comment 6 Kostiantyn Miklevskyi 2016-10-10 07:05:36 UTC
Created attachment 34351 [details]
LibreOffice 5.2.2.2 original file and corrupted file side-to-side

I downloaded the latest stable LibreOffice, version 5.2.2.2, and it indeed doesn't complain about the corruption, so I opened the original document and the corrupted one side by side to show the difference.
Comment 7 Dominik Stadler 2019-08-29 18:04:02 UTC
This looks quite similar to bug #60097, so I am closing this one as a duplicate to have one place to continue the discussion.

*** This bug has been marked as a duplicate of bug 60097 ***