Bug 62129 - java.lang.ArrayIndexOutOfBoundsException when HWPFDocument reads a Word file
Summary: java.lang.ArrayIndexOutOfBoundsException when HWPFDocument reads a Word file
Alias: None
Product: POI
Classification: Unclassified
Component: HWPF
Version: 3.17-FINAL
Hardware: PC All
Importance: P2 normal
Target Milestone: ---
Assignee: POI Developers List
Depends on:
Reported: 2018-02-24 02:31 UTC by sven.zhang
Modified: 2018-04-02 21:06 UTC
CC List: 0 users

word file and code (3.38 KB, application/x-zip-compressed)
2018-02-24 02:31 UTC, sven.zhang

Description sven.zhang 2018-02-24 02:31:47 UTC
Created attachment 35740
word file and code

The Word file was generated from HTML. It opens fine in Office 2013, but reading it with HWPFDocument throws:

java.lang.ArrayIndexOutOfBoundsException: 94754
	at org.apache.poi.util.LittleEndian.getUShort(LittleEndian.java:327)
	at org.apache.poi.hwpf.model.FileInformationBlock.<init>(FileInformationBlock.java:113)
	at org.apache.poi.hwpf.HWPFDocumentCore.<init>(HWPFDocumentCore.java:167)
	at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:197)
	at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:181)
	at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:169)
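
The trace shows the exception surfacing in LittleEndian.getUShort with index 94754, which suggests a two-byte read at an offset past the end of the byte array. The following is a minimal sketch of a bounds-checked little-endian read that reports the problem with a descriptive message instead. BoundedLittleEndian is a hypothetical helper written for illustration; it is not part of Apache POI and is not how POI implements the read.

```java
import java.io.IOException;

/** Hypothetical helper for illustration; not part of Apache POI. */
final class BoundedLittleEndian {
    private BoundedLittleEndian() {}

    /**
     * Reads an unsigned 16-bit little-endian value from data at offset.
     * Throws IOException with a descriptive message when the two bytes
     * are not fully inside the buffer, rather than letting an
     * ArrayIndexOutOfBoundsException escape.
     */
    static int getUShort(byte[] data, int offset) throws IOException {
        if (offset < 0 || offset > data.length - 2) {
            throw new IOException("Cannot read 2 bytes at offset " + offset
                    + " from a buffer of " + data.length + " bytes");
        }
        int low = data[offset] & 0xFF;       // least significant byte comes first
        int high = data[offset + 1] & 0xFF;  // most significant byte comes second
        return (high << 8) | low;
    }
}
```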
Comment 1 Dominik Stadler 2018-04-02 21:06:46 UTC
This is actually some sort of HTML file with some data prepended to allow Word to open it.

Apache POI does not have support for such files currently and there are no plans to add support anytime soon unless someone steps up and provides patches and proper unit-tests.

Try writing the file in one of the official Office formats, which recent versions of Apache POI support to some degree (unfortunately, support for the Word formats is not yet complete).
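
Until such files are supported, calling code may want to treat the failure as an unsupported input rather than let an unchecked exception escape. The sketch below is one possible way to do that: it wraps any loader and converts a RuntimeException such as ArrayIndexOutOfBoundsException into an IOException that keeps the original as its cause. SafeLoad and Loader are hypothetical names introduced here for illustration; they are not part of Apache POI.

```java
import java.io.IOException;

/** Hypothetical wrapper for illustration; not part of Apache POI. */
final class SafeLoad {
    private SafeLoad() {}

    /** A loading step that may fail with a checked or an unchecked exception. */
    @FunctionalInterface
    interface Loader<T> {
        T load() throws IOException;
    }

    /**
     * Runs the loader and rethrows any unchecked exception as an IOException,
     * keeping the original exception as the cause for diagnosis.
     */
    static <T> T load(String description, Loader<T> loader) throws IOException {
        try {
            return loader.load();
        } catch (RuntimeException e) {
            throw new IOException("Unsupported or malformed file: " + description, e);
        }
    }
}
```

With POI on the classpath, a call could look like `SafeLoad.load(name, () -> new HWPFDocument(in))`, where `name` and `in` are the caller's file name and input stream. The caller then handles a single IOException for both I/O errors and files that the parser cannot read.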