Bug 21725 - Error Reading word document
Summary: Error Reading word document
Alias: None
Product: POI
Classification: Unclassified
Component: POI Overall
Version: unspecified
Hardware: Sun Solaris
Importance: P1 critical
Target Milestone: ---
Assignee: POI Developers List
Depends on:
Reported: 2003-07-19 04:59 UTC by Kesavan Janakiraman
Modified: 2004-11-16 19:05 UTC

MS Word Document to be tested (765.50 KB, application/ms-word)
2003-07-19 05:05 UTC, Kesavan Janakiraman
MS Word Document. Ignore the previous one. (305.00 KB, application/ms-word)
2003-07-19 23:55 UTC, Kesavan Janakiraman

Description Kesavan Janakiraman 2003-07-19 04:59:40 UTC

I am getting an ArrayIndexOutOfBoundsException while trying to convert a Word 
document into text. I am able to convert most documents, but the conversion 
fails for some particular documents. Please help me solve this problem. This 
is a very high-priority issue.

The exception occurs at the following piece of code:

WordDocument wd;
wd = new WordDocument(origFileName);


The output obtained is as follows:

Title: ""
Error reading document:/usr/tmp/interface_10415.doc

Thanks in advance.
Comment 1 Kesavan Janakiraman 2003-07-19 05:05:25 UTC
Created attachment 7384
MS Word Document to be tested
Comment 2 Ryan Ackley 2003-07-19 10:42:46 UTC
We are in the middle of a complete rewrite of HDF (including a name change to 
HWPF). If all you want is to extract text from a Word 97/2000/XP document, 
there is a library at http://www.textmining.org
Comment 3 Kesavan Janakiraman 2003-07-19 23:55:51 UTC
Created attachment 7401
MS Word Document. Ignore the previous one.
Comment 4 Kesavan Janakiraman 2003-07-21 17:55:32 UTC
Ok. I will check that and see.

Comment 5 Andy Oliver 2003-07-24 17:22:43 UTC
So resolve the bug, Ryan...