Bug 51834 - Opening and Writing .doc file results in corrupt document
Status: REOPENED
Alias: None
Product: POI
Classification: Unclassified
Component: HWPF
Version: 3.8-dev
Hardware: PC Windows XP
Importance: P2 major with 1 vote
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-09-16 11:47 UTC by Gilbert
Modified: 2014-03-11 11:54 UTC
CC List: 2 users



Attachments
Opening and re-writing this file corrupts the output (46.00 KB, application/msword)
2011-09-16 11:47 UTC, Gilbert
Result doc (correct one) (48.50 KB, application/octet-stream)
2011-10-02 01:09 UTC, Sergey Vladimirov
Validation result (96 bytes, application/xml)
2011-10-02 01:09 UTC, Sergey Vladimirov
Opening and re-writing this file corrupts the output (26.50 KB, application/msword)
2011-12-26 16:23 UTC, poi.dev.art

Description Gilbert 2011-09-16 11:47:25 UTC
Created attachment 27508
Opening and re-writing this file corrupts the output

Running this code against the attached document produces a corrupt Word document: MS Word 2003 crashes when opening it, and Word 2007 refuses to open it.

    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.InputStream;

    import org.apache.poi.hwpf.HWPFDocument;
    import org.apache.poi.poifs.filesystem.POIFSFileSystem;

    private void start() {
        InputStream resourceAsStream = getClass().getResourceAsStream("/com/blackbox/admin/templates/rma.doc");
        try {
            // Open the template and write it back out without modifying it.
            POIFSFileSystem fsfilesystem = new POIFSFileSystem(resourceAsStream);
            HWPFDocument hwpfdoc = new HWPFDocument(fsfilesystem);

            FileOutputStream fos = new FileOutputStream(new File("C:\\temp\\newTemplate.doc"));
            try {
                hwpfdoc.write(fos);
                fos.flush();
            } finally {
                // Close the output file even if write() fails.
                fos.close();
            }
        } catch (IOException e) {
            // Also covers FileNotFoundException, which is a subclass of IOException.
            e.printStackTrace();
        }

        System.out.println("Opened");
    }
Comment 1 Sergey Vladimirov 2011-10-02 01:08:32 UTC
Please check the latest code from trunk and the attachment with the saved document. It passes the Microsoft BFFValidator.

Several bugs were fixed:
 - summary properties handling
 - extended FIB handling
 - lists handling
Comment 2 Sergey Vladimirov 2011-10-02 01:09:18 UTC
Created attachment 27667
Result doc (correct one)
Comment 3 Sergey Vladimirov 2011-10-02 01:09:48 UTC
Created attachment 27668
Validation result
Comment 4 poi.dev.art 2011-12-26 16:23:03 UTC
Created attachment 28099
Opening and re-writing this file corrupts the output

Table cells seem to be the problem.

Tested:
Merging any cells (using Word 2007) in the input document before re-writing it makes the output clean.
Removing the table produces a clean output too.
Comment 5 poi.dev.art 2011-12-26 16:25:06 UTC
Reopening for 3.8-beta5: see the previous comment.
Comment 6 melanie.reiter 2014-03-11 11:54:10 UTC
This bug still exists in version 3.10 final.

The following situation occurred:

My Word document contains a table, and I want to replace some text in a cell.
The replacement itself works and I can open the resulting file with Word 2010, but not with Word 2003 (it is a .doc file).

There are three cases after replacing the text:

1. The old text and the replacement have the same length: no problem, the file opens in Word 2003.

2. The old text is longer than the replacement: Word 2003 can open the file via "Open and Repair".

3. The old text is shorter than the replacement: Word 2003 crashes.

All of these documents can be opened with Word 2010.

In another test I replaced text that is part of an enumeration but not inside a table, and it shows the same behavior.
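
The three cases listed above can be restated as a small lookup, which may be handy when building a test matrix for this bug. This is only a sketch: the class, enum, and method names are hypothetical and not part of POI, and the outcomes simply repeat the Word 2003 observations from this comment rather than anything derived from the file format.

```java
// Hypothetical helper; none of these names exist in Apache POI.
final class ReplacementCases {

    /** Word 2003 behavior as observed in this report, keyed by length relation. */
    enum Outcome {
        OPENS_NORMALLY,      // case 1: old text and replacement have the same length
        OPENS_AFTER_REPAIR,  // case 2: old text is longer than the replacement
        CRASHES              // case 3: old text is shorter than the replacement
    }

    /** Maps an (old text, replacement) pair to the outcome reported above. */
    static Outcome reportedWord2003Outcome(String oldText, String newText) {
        int delta = newText.length() - oldText.length();
        if (delta == 0) {
            return Outcome.OPENS_NORMALLY;
        }
        return delta < 0 ? Outcome.OPENS_AFTER_REPAIR : Outcome.CRASHES;
    }

    public static void main(String[] args) {
        // Sample pairs covering the three cases; the strings are made up.
        String[][] pairs = {
            {"NAME", "JOHN"},
            {"PLACEHOLDER", "X"},
            {"X", "PLACEHOLDER"}
        };
        for (String[] p : pairs) {
            System.out.println(p[0] + " -> " + p[1] + ": "
                    + reportedWord2003Outcome(p[0], p[1]));
        }
    }
}
```

Only the length difference is considered here because that is the only variable the report distinguishes; whether other factors matter is not established by these observations.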