Bug 39526

Summary: ArrayIndexOutOfBoundsException opening Word document
Product: POI
Reporter: Trejkaz (pen name) <trejkaz>
Component: HWPF
Assignee: POI Developers List <dev>
Status: RESOLVED FIXED
Severity: major
CC: vova455
Priority: P1    
Version: 3.7-FINAL   
Target Milestone: ---   
Hardware: Other   
OS: other   

Description Trejkaz (pen name) 2006-05-09 05:41:56 UTC
There may be some issue with opening the ListLevel structures.  Unfortunately we
can't give you a test document for this one.  Needless to say it eliminates the
ability to read any text from the document.  It would be good if at the very
least, the things which are valid would still be read out.

java.lang.ArrayIndexOutOfBoundsException: 36251
        at org.apache.poi.util.LittleEndian.getNumber(LittleEndian.java:491)
        at org.apache.poi.util.LittleEndian.getShort(LittleEndian.java:52)
        at org.apache.poi.hwpf.model.ListLevel.<init>(ListLevel.java:123)
        at org.apache.poi.hwpf.model.ListFormatOverrideLevel.<init>(ListFormatOverrideLevel.java:49)
        at org.apache.poi.hwpf.model.ListTables.<init>(ListTables.java:85)
        at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:185)
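
The trace shows an offset (36251) being used as an array index inside LittleEndian.getShort without a prior range check. A minimal sketch of a bounds-checked little-endian read follows, using only the JDK. The class and method names are hypothetical and are not POI's API; reporting the failure as an IOException is an assumption about how a caller might want to treat corrupt input.

```java
import java.io.IOException;

/** Hypothetical helper, not part of POI: little-endian short read with an explicit range check. */
final class CheckedLittleEndian {
    private CheckedLittleEndian() {}

    /**
     * Reads a signed 16-bit little-endian value starting at offset.
     * Throws IOException instead of ArrayIndexOutOfBoundsException when the
     * two bytes do not fit inside the buffer, so callers can handle it as bad input.
     */
    static short getShort(byte[] data, int offset) throws IOException {
        if (offset < 0 || offset > data.length - 2) {
            throw new IOException("Offset " + offset
                    + " out of range for buffer of length " + data.length);
        }
        int low = data[offset] & 0xFF;       // least significant byte comes first
        int high = data[offset + 1] & 0xFF;  // most significant byte comes second
        return (short) ((high << 8) | low);
    }
}
```

With a check of this kind, a caller could skip the one unreadable structure and carry on with the rest of the document, which is the behaviour asked for in the description.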
Comment 1 Jason OConnell 2009-01-15 02:42:15 UTC
Same situation for me with 3.5 beta 4

Caused by: java.lang.ArrayIndexOutOfBoundsException: 593194
        at org.apache.poi.util.LittleEndian.getShort(LittleEndian.java:45)
        at org.apache.poi.hwpf.model.ListLevel.<init>(ListLevel.java:120)
        at org.apache.poi.hwpf.model.ListFormatOverrideLevel.<init>(ListFormatOverrideLevel.java:50)
        at org.apache.poi.hwpf.model.ListTables.<init>(ListTables.java:89)
        at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:269)
        at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:158)
        at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:146)
Comment 2 Maxim Valyanskiy 2009-07-27 05:55:32 UTC
I have the same problem on trunk (July 2009):

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 130936
	at org.apache.poi.util.LittleEndian.getShort(LittleEndian.java:46)
	at org.apache.poi.hwpf.model.ListLevel.<init>(ListLevel.java:120)
	at org.apache.poi.hwpf.model.ListFormatOverrideLevel.<init>(ListFormatOverrideLevel.java:48)
	at org.apache.poi.hwpf.model.ListTables.<init>(ListTables.java:88)
	at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:268)
	at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:157)
	at org.apache.poi.hwpf.extractor.WordExtractor.<init>(WordExtractor.java:62)
	at org.apache.poi.hwpf.extractor.WordExtractor.<init>(WordExtractor.java:54)
	at org.apache.poi.hwpf.extractor.WordExtractor.main(WordExtractor.java:92)


Unfortunately, I can't give you a test document.
Comment 3 Nick Burch 2011-02-25 17:01:16 UTC
Does this problem still remain with a recent svn nightly build / poi 3.8 beta 1 (when released shortly...)?
Comment 4 Trejkaz (pen name) 2011-04-05 00:50:33 UTC
Let's assume that it is.  We don't have a copy of a file which exhibits the issue and I can't get any sample data from the source either.
Comment 5 Maxim Valyanskiy 2011-04-11 09:13:38 UTC
I still have a document that raises the same exception. Unfortunately, it carries a clear statement about restricted distribution, so I can't attach it to the bug report.
Comment 6 Trejkaz (pen name) 2011-04-12 06:22:20 UTC
Fair enough.  Reopening, as I'm sure it will affect us eventually too.
Comment 7 Vladimir Pevunov 2011-06-02 14:15:16 UTC
I have a document which raises the same exception. It is available at http://easyimpress.com/files/english.doc
Comment 8 Yegor Kozlov 2011-06-24 08:11:04 UTC
Confirmed that we still have the problem in trunk (as of r1138799). 

Yegor

(In reply to comment #7)
> I have a document which raises the same exception. It is available at
> http://easyimpress.com/files/english.doc
Comment 9 Sergey Vladimirov 2011-10-30 00:37:54 UTC
File not found at specified location.
Comment 10 Vladimir Pevunov 2011-11-03 17:00:57 UTC
(In reply to comment #9)
> File not found at specified location.

Sorry about that. The document is now available at http://67.23.29.23/english.doc

Thank you,
Vladimir
Comment 11 Sergey Vladimirov 2012-11-05 10:48:03 UTC
Seems to be fixed as part of bug 53380.
Comment 12 Mr.zhang 2020-06-17 10:31:03 UTC
I have the same problem when I read a .doc document.
These are my POI dependencies:
<dependency>
	<groupId>org.apache.poi</groupId>
	<artifactId>poi</artifactId>
	<version>4.1.2</version>
</dependency>
<dependency>
	<groupId>org.apache.poi</groupId>
	<artifactId>poi-ooxml</artifactId>
	<version>4.1.2</version>
</dependency>
<dependency>
	<groupId>org.apache.poi</groupId>
	<artifactId>poi-ooxml-schemas</artifactId>
	<version>4.1.2</version>
</dependency>
<dependency>
	<groupId>org.apache.poi</groupId>
	<artifactId>poi-scratchpad</artifactId>
	<version>4.1.2</version>
</dependency>

My code:
InputStream is = new FileInputStream(path);
HWPFDocument doc = new HWPFDocument(is);
StringBuilder buffer = doc.getText();

The error:
java.lang.ArrayIndexOutOfBoundsException: Index 65946 out of bounds for length 9355
	at org.apache.poi.util.LittleEndian.getUShort(LittleEndian.java:355)
	at org.apache.poi.hwpf.model.FileInformationBlock.<init>(FileInformationBlock.java:118)
	at org.apache.poi.hwpf.HWPFDocumentCore.<init>(HWPFDocumentCore.java:170)
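
Until the underlying parsing problem is resolved, one workaround is to catch the unchecked exception at the call site so that a single malformed file does not abort a whole run. The sketch below uses only the JDK; TolerantExtraction and tryExtract are hypothetical names, and the Callable is assumed to wrap the HWPFDocument calls shown in the code above.

```java
import java.util.Optional;
import java.util.concurrent.Callable;

/** Hypothetical helper, not part of POI: runs an extractor and reports failure as an empty result. */
final class TolerantExtraction {
    private TolerantExtraction() {}

    /**
     * Runs the given extractor and returns its text, or Optional.empty() if it
     * throws or returns null. Catching Exception covers IOException as well as
     * unchecked failures such as ArrayIndexOutOfBoundsException.
     */
    static Optional<String> tryExtract(Callable<String> extractor) {
        try {
            return Optional.ofNullable(extractor.call());
        } catch (Exception e) {
            System.err.println("Extraction failed: " + e);
            return Optional.empty();
        }
    }
}
```

This only hides the symptom: the text of the affected document is still lost, so it is a stopgap for batch processing rather than a fix.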