Bug 22014 - wordDocument.writeAllText() return null
Status: RESOLVED WONTFIX
Alias: None
Product: POI
Classification: Unclassified
Component: HDF
Version: 2.0-pre3
Hardware: PC All
Importance: P3 major
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2003-07-31 09:42 UTC by rastin
Modified: 2005-03-20 17:06 UTC



Attachments
this is the word document that will return null when parsed by wordDocument.writeAllText() (4.00 KB, application/msword)
2003-08-01 00:32 UTC, rastin
Contains program that would make this problem show up. Unzip and run java -jar Bug22014Replicator.jar <path-to-word-document> (773.42 KB, application/zip)
2003-08-01 01:13 UTC, rastin

Description rastin 2003-07-31 09:42:13 UTC
wordDocument.writeAllText() returns null when an MS Word doc file contains many 
tabs and indents. This happens whenever a WordDocument of this kind is 
processed. Please give me an email address or a URL where I could upload 
the doc file.
Comment 1 rastin 2003-07-31 09:49:23 UTC
Below is a method that can be used to replicate the condition:


public String convertToText(String theWordDoc) throws java.lang.Exception {
    StringWriter out = null;
    String result = null;

    try {
        // WordDocument is org.apache.poi.hdf.extractor.WordDocument
        WordDocument wordDocument = new WordDocument(theWordDoc);
        out = new StringWriter();
        wordDocument.writeAllText(out);
        out.flush();
        result = out.getBuffer().toString();
    }
    finally {
        if (out != null) {
            out.close();
        }
    }
    return result; // null is being returned
}


I could send the Word doc as well. Please let me know if required.
Comment 2 rastin 2003-08-01 00:32:20 UTC
Created attachment 7610
this is the word document that will return null when parsed by wordDocument.writeAllText()
Comment 3 rastin 2003-08-01 01:13:16 UTC
Created attachment 7611
Contains program that would make this problem show up. Unzip and run java -jar Bug22014Replicator.jar <path-to-word-document>
Comment 4 rastin 2003-08-01 01:18:20 UTC
When running the program (java -jar Bug22014Replicator.jar <path-to-word-document>)
with the attached Word document, I get the following:

Exception in thread "main" java.io.IOException: Invalid header signature; read 290763650945099227, expected -2226271756974174256
        at org.apache.poi.poifs.storage.HeaderBlockReader.<init>(HeaderBlockReader.java:124)
        at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:120)
        at org.apache.poi.hdf.extractor.WordDocument.<init>(WordDocument.java:229)
        at org.apache.poi.hdf.extractor.WordDocument.<init>(WordDocument.java:222)
        at bug22014.ReplicateBug22014.replicate(ReplicateBug22014.java:39)
        at bug22014.ReplicateBug22014.main(ReplicateBug22014.java:30)


However, the Word document opens fine in Word Viewer, Microsoft Word, and 
OpenOffice.
Comment 5 Ryan Ackley 2003-08-01 14:11:24 UTC
This document is from Word 2.0. Next time you have this problem, open the 
document in Word and choose Save As. The version of the format will show up 
in the "Save as type" field. We don't support Word 2.0 and we have no plans 
to support it. Sorry.
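
For anyone hitting the same exception: the two numbers in the message appear to be the first eight bytes of the file read as a little-endian long. The sketch below uses plain Java and does not need POI. The class name Ole2SignatureCheck and its helper methods are made up for illustration and are not part of POI. It decodes both values: the expected one corresponds to the OLE2 header bytes D0 CF 11 E0 A1 B1 1A E1, while the value read from the attachment starts with DB A5 2D, which is not an OLE2 header and is consistent with the pre-OLE2 Word format described in comment 5.

```java
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of POI: inspects the 8-byte file signature.
class Ole2SignatureCheck {
    // OLE2 header bytes D0 CF 11 E0 A1 B1 1A E1 interpreted as a little-endian long.
    static final long OLE2_SIGNATURE = 0xE11AB1A1E011CFD0L;

    // Interprets the first 8 bytes of a file as a little-endian long.
    static long readSignature(byte[] first8) {
        return ByteBuffer.wrap(first8, 0, 8).order(ByteOrder.LITTLE_ENDIAN).getLong();
    }

    // True if the bytes start with the OLE2 compound-document signature.
    static boolean isOle2(byte[] first8) {
        return first8.length >= 8 && readSignature(first8) == OLE2_SIGNATURE;
    }

    // Renders a long as the 8 bytes it had on disk (lowest byte first).
    static String toHexBytes(long value) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 8; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", (value >>> (8 * i)) & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // The two values from the exception message in comment 4.
        System.out.println(toHexBytes(-2226271756974174256L)); // D0 CF 11 E0 A1 B1 1A E1
        System.out.println(toHexBytes(290763650945099227L));   // DB A5 2D 00 00 00 09 04

        // Optionally check a real file passed on the command line.
        if (args.length > 0) {
            byte[] first8 = new byte[8];
            try (DataInputStream in = new DataInputStream(new FileInputStream(args[0]))) {
                in.readFully(first8); // throws EOFException if the file is shorter than 8 bytes
            }
            System.out.println(args[0] + " is OLE2: " + isOle2(first8));
        }
    }
}
```

Running a check like this before constructing a WordDocument lets a caller report "not an OLE2 file" instead of surfacing the raw signature mismatch.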