Bug 62129

Summary: java.lang.ArrayIndexOutOfBoundsException when HWPFDocument reads a Word file
Product: POI
Reporter: sven.zhang <dehuang_zhang>
Component: HWPF
Assignee: POI Developers List <dev>
Status: RESOLVED WONTFIX    
Severity: normal    
Priority: P2    
Version: 3.17-FINAL   
Target Milestone: ---   
Hardware: PC   
OS: All   
Attachments: word file and code

Description sven.zhang 2018-02-24 02:31:47 UTC
Created attachment 35740: word file and code

The Word file was generated from HTML. It can be opened in Office 2013, but reading it with HWPFDocument throws:

java.lang.ArrayIndexOutOfBoundsException: 94754
	at org.apache.poi.util.LittleEndian.getUShort(LittleEndian.java:327)
	at org.apache.poi.hwpf.model.FileInformationBlock.<init>(FileInformationBlock.java:113)
	at org.apache.poi.hwpf.HWPFDocumentCore.<init>(HWPFDocumentCore.java:167)
	at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:197)
	at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:181)
	at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:169)
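
The trace ends in a 16-bit little-endian read at index 94754, which lies past the end of the byte array being read. The following is only a minimal sketch of that failure mode and of a bounds-checked alternative. The class `CheckedLittleEndian` and both methods are hypothetical names for illustration, not the actual POI code.

```java
import java.util.OptionalInt;

public class CheckedLittleEndian {

    /**
     * Unchecked read of an unsigned 16-bit little-endian value.
     * Throws ArrayIndexOutOfBoundsException if offset or offset + 1
     * is outside the array, as in the trace above.
     */
    public static int getUShortUnchecked(byte[] data, int offset) {
        int b0 = data[offset] & 0xFF;      // low byte
        int b1 = data[offset + 1] & 0xFF;  // high byte
        return (b1 << 8) | b0;
    }

    /**
     * Checked variant (hypothetical): returns an empty OptionalInt
     * instead of throwing when two bytes are not available at offset.
     */
    public static OptionalInt getUShortChecked(byte[] data, int offset) {
        if (offset < 0 || offset > data.length - 2) {
            return OptionalInt.empty();
        }
        return OptionalInt.of(getUShortUnchecked(data, offset));
    }

    public static void main(String[] args) {
        byte[] data = {0x34, 0x12};
        // In range: the two bytes decode to 0x1234.
        System.out.println(getUShortChecked(data, 0).isPresent());
        // Out of range: no exception, just an empty result.
        System.out.println(getUShortChecked(data, 94754).isPresent());
    }
}
```

An offset this far out of range usually means the offset itself was derived from bytes that are not in the expected format, so a bounds check only turns the crash into a cleaner error. It does not make the file readable.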

Comment 1 Dominik Stadler 2018-04-02 21:06:46 UTC
This is actually some sort of HTML file with some data prepended to allow Word to open it.

Apache POI does not currently support such files, and there are no plans to add support anytime soon unless someone steps up and provides patches and proper unit tests.

You should try to write the file in one of the official Office formats, which recent versions of Apache POI support to some degree (unfortunately, support for the Word formats is not yet complete).
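
If the input format cannot be controlled, a caller may at least want to report such files as unsupported instead of letting an unchecked exception escape. This is only a sketch under that assumption. `GuardedLoad` and `load` are hypothetical names, and the lambda in `main` is a stand-in for a real call such as `() -> new HWPFDocument(inputStream)`.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

public class GuardedLoad {

    /**
     * Hypothetical helper: runs a document loader and reports unchecked
     * parser failures (such as ArrayIndexOutOfBoundsException) as an
     * IOException, keeping the original exception as the cause.
     */
    public static <T> T load(Callable<T> loader) throws IOException {
        try {
            return loader.call();
        } catch (IOException e) {
            throw e; // genuine I/O problems pass through unchanged
        } catch (RuntimeException e) {
            throw new IOException("Unsupported or malformed document: " + e, e);
        } catch (Exception e) {
            throw new IOException(e);
        }
    }

    public static void main(String[] args) {
        try {
            // Stand-in for: load(() -> new HWPFDocument(inputStream))
            load(() -> {
                throw new ArrayIndexOutOfBoundsException("94754");
            });
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Catching RuntimeException this broadly is a deliberate trade-off for untrusted input. It hides the distinction between an unsupported file and a genuine parser bug, so the original exception is preserved as the cause.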