Bug 60626 - ArrayIndexOutOfBoundsException in EvilUnclosedBRFixingInputStream
Alias: None
Product: POI
Classification: Unclassified
Component: XSSF
Version: 3.16-dev
Hardware: All All
Importance: P2 blocker
Target Milestone: ---
Assignee: POI Developers List
Depends on:
Reported: 2017-01-23 08:32 UTC by Joachim Piketz
Modified: 2017-02-10 07:32 UTC

VML file that causes the problem (28.27 KB, application/xhtml+xml)
2017-01-23 08:32 UTC, Joachim Piketz
Alternative implementation of EvilUnclosedBRFixingInputStream (6.49 KB, text/plain)
2017-01-23 08:36 UTC, Joachim Piketz

Description Joachim Piketz 2017-01-23 08:32:46 UTC
Created attachment 34663
VML file that causes the problem

I have an Excel file that can't be loaded. I found that EvilUnclosedBRFixingInputStream has a problem with a VML file that was part of my Excel file.

The following sample code reproduces the problem:

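String xmlFile = "vmlDrawing3.vml";
byte[] data = Files.readAllBytes(Paths.get(xmlFile));
ByteArrayInputStream bis = new ByteArrayInputStream(data);
EvilUnclosedBRFixingInputStream is = new EvilUnclosedBRFixingInputStream(bis);
// Parsing the wrapped stream is what fails; see DocumentHelper.readDocument in the stack trace.
DocumentHelper.readDocument(is);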

The following exception is thrown, although not on all operating systems/JDK versions:

   Caused by: java.lang.ArrayIndexOutOfBoundsException: 2048
     at org.apache.xerces.impl.io.UTF8Reader.read(UTF8Reader.java:336)
     at org.apache.xerces.impl.XMLEntityScanner.load(XMLEntityScanner.java:1753)
     at org.apache.xerces.impl.XMLEntityScanner.scanLiteral(XMLEntityScanner.java:834)
     at org.apache.xerces.impl.XMLScanner.scanAttributeValue(XMLScanner.java:772)
     at org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanAttribute(XMLNSDocumentScannerImpl.java:529)
     at org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanStartElement(XMLNSDocumentScannerImpl.java:181)
     at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(XMLDocumentFragmentScannerImpl.java:1653)
     at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:324)
     at org.apache.xerces.parsers.XML11Configuration.parse(XML11Configuration.java:875)
     at org.apache.xerces.parsers.XML11Configuration.parse(XML11Configuration.java:798)
     at org.apache.xerces.parsers.XMLParser.parse(XMLParser.java:108)
     at org.apache.xerces.parsers.DOMParser.parse(DOMParser.java:230)
     at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:298)
     at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:121)
     at org.apache.poi.util.DocumentHelper.readDocument(DocumentHelper.java:137)
Comment 1 Joachim Piketz 2017-01-23 08:34:52 UTC
I used a FilterInputStream to test whether it works with another EvilUnclosedBRFixingInputStream implementation, and it did. I have attached the file, but have not tested it any further.
Comment 2 Joachim Piketz 2017-01-23 08:36:02 UTC
Created attachment 34664
Alternative implementation of EvilUnclosedBRFixingInputStream
Comment 3 Andreas Beeker 2017-01-31 00:37:03 UTC
Not sure who copied from whom, but if this wasn't yours first, it would be nice if you posted the source:



Btw, I'm facing exactly the same problem with some user input, and from the looks of it the new implementation is cleaner ...
Comment 4 Andreas Beeker 2017-02-08 01:26:50 UTC
The old implementation was somehow affected by the number of bytes read per call ... although there was a test for different buffer sizes.
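
For illustration only, here is a minimal sketch of a filter whose output cannot depend on how many bytes the caller asks for, because every bulk read is routed through the single-byte path. It assumes the goal is to rewrite "<br>" as "<br/>". It is neither the attached alternative implementation nor the code committed for this bug; the class name BrFixingStream and the helper fix() are hypothetical, and skip()/mark() are deliberately left unhandled.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch: rewrites every "<br>" to "<br/>", one byte at a time. */
final class BrFixingStream extends FilterInputStream {
    // Bytes that must follow '<' for a match, and what is emitted instead.
    private static final byte[] TAIL = {'b', 'r', '>'};
    private static final byte[] FIXED_TAIL = {'b', 'r', '/', '>'};

    private final PushbackInputStream source;
    // Index of the next replacement byte to emit; FIXED_TAIL.length means none pending.
    private int pendingPos = FIXED_TAIL.length;

    BrFixingStream(InputStream in) {
        super(new PushbackInputStream(in, TAIL.length));
        this.source = (PushbackInputStream) this.in;
    }

    @Override
    public int read() throws IOException {
        if (pendingPos < FIXED_TAIL.length) {
            return FIXED_TAIL[pendingPos++];
        }
        int b = source.read();
        if (b != '<') {
            return b; // ordinary byte or end of stream
        }
        // Look ahead for "br>"; on a mismatch or EOF, push the lookahead back.
        byte[] seen = new byte[TAIL.length];
        int n = 0;
        boolean matched = true;
        while (n < TAIL.length) {
            int next = source.read();
            if (next < 0) {
                matched = false;
                break;
            }
            seen[n++] = (byte) next;
            if ((byte) next != TAIL[n - 1]) {
                matched = false;
                break;
            }
        }
        if (matched) {
            pendingPos = 0;             // emit "br/>" on the following reads
        } else {
            source.unread(seen, 0, n);  // re-scan these bytes on later reads
        }
        return '<';
    }

    @Override
    public int read(byte[] dst, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        int count = 0;
        while (count < len) {
            int b = read(); // bulk reads reuse the single-byte state machine
            if (b < 0) {
                break;
            }
            dst[off + count++] = (byte) b;
        }
        return count == 0 ? -1 : count;
    }

    /** Test helper: runs the filter over xml, reading chunkSize bytes per call. */
    static String fix(String xml, int chunkSize) throws IOException {
        InputStream in = new BrFixingStream(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[chunkSize];
        int n;
        while ((n = in.read(chunk, 0, chunk.length)) > 0) {
            out.write(chunk, 0, n);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Calling fix(input, chunkSize) with chunk sizes from 1 upward on the same input should give identical results, which is the kind of check a buffer-size test would need to cover, including tags that straddle a chunk boundary. The per-byte loop trades throughput for simplicity.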

I've updated the license references, which should be OK because the MIT license is compatible [1].

Fixed with r1782095.

[1] https://www.apache.org/legal/resolved#category-a