Summary: | RuntimeException on extracting text from Word 97-2004 Document | ||
---|---|---|---|
Product: | POI | Reporter: | Jeremy B. Merrill <jeremy.merrill> |
Component: | HWPF | Assignee: | POI Developers List <dev> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | sergeymalafeev |
Priority: | P2 | ||
Version: | 3.12-dev | ||
Target Milestone: | --- | ||
Hardware: | PC | ||
OS: | All | ||
Attachments: | failing document |
Description
Jeremy B. Merrill
2015-04-21 15:18:57 UTC
POI detected this as a Word 95 or older file, requiring HWPFOldDocument to read the file. The file claims it is a Microsoft Word 6.0 Document, which is the file format of Word 6.0, released in 1993 [1].

[1] https://en.wikipedia.org/wiki/Microsoft_Word#Release_history

I got the same error as you in the latest version of POI, 3.16 trunk. I added this failing unit test in r1761873.

> java.lang.ArrayIndexOutOfBoundsException
>     at java.lang.System.arraycopy(Native Method)
>     at org.apache.poi.hwpf.model.PAPFormattedDiskPage.getGrpprl(PAPFormattedDiskPage.java:171)
>     at org.apache.poi.hwpf.model.PAPFormattedDiskPage.<init>(PAPFormattedDiskPage.java:101)
>     at org.apache.poi.hwpf.model.OldPAPBinTable.<init>(OldPAPBinTable.java:49)
>     at org.apache.poi.hwpf.HWPFOldDocument.<init>(HWPFOldDocument.java:107)
>     at org.apache.poi.hwpf.HWPFOldDocument.<init>(HWPFOldDocument.java:45)
>     at org.apache.poi.hwpf.usermodel.TestBugs.test57843(TestBugs.java:911)

There is a failing test for this at org.apache.poi.hwpf.usermodel.TestBugs.test57603SevenRowTable, which was added via r1761873.
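
The trace ends in System.arraycopy called from PAPFormattedDiskPage.getGrpprl, so the offset or length passed to the copy was out of range for one of the arrays, possibly because a value read from the old-format file points past the end of the page buffer. The sketch below only illustrates that failure mode and one way to guard against it. It is not POI's code and not the fix that was committed; the names GrpprlCopySketch and safeCopy are hypothetical, and it uses only the Java standard library.

```java
import java.util.Arrays;

// Hypothetical illustration, not part of Apache POI.
class GrpprlCopySketch {

    /**
     * Copies up to 'length' bytes from 'page' starting at 'offset'.
     * If the requested range runs past the end of the buffer, the copy
     * is clamped to the bytes that actually exist. An offset outside
     * the buffer or a non-positive length yields an empty array.
     */
    static byte[] safeCopy(byte[] page, int offset, int length) {
        if (offset < 0 || offset >= page.length || length <= 0) {
            return new byte[0];
        }
        int available = page.length - offset;
        int toCopy = Math.min(length, available);
        byte[] out = new byte[toCopy];
        System.arraycopy(page, offset, out, 0, toCopy);
        return out;
    }

    public static void main(String[] args) {
        byte[] page = {1, 2, 3, 4, 5};

        // Unguarded copy: asks for 10 bytes when only 2 remain after
        // offset 3, so System.arraycopy rejects the range.
        try {
            byte[] out = new byte[10];
            System.arraycopy(page, 3, out, 0, 10);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("unguarded copy failed: " + e);
        }

        // Guarded copy: the same request is clamped to the 2 available bytes.
        System.out.println(Arrays.toString(safeCopy(page, 3, 10)));
    }
}
```

Whether clamping, skipping the entry, or failing with a clearer exception is the right behaviour for a real parser depends on the file format; the sketch picks clamping only to keep the example short.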