Created attachment 35961 [details]
test file for crash apache poi open

An error occurs while opening the file:

org.apache.poi.POIXMLException: java.lang.reflect.InvocationTargetException
    at org.apache.poi.POIXMLFactory.createDocumentPart(POIXMLFactory.java:63)
    at org.apache.poi.POIXMLDocumentPart.read(POIXMLDocumentPart.java:580)
    at org.apache.poi.POIXMLDocumentPart.read(POIXMLDocumentPart.java:592)
    at org.apache.poi.POIXMLDocument.load(POIXMLDocument.java:165)
    at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:270)
    at org.apache.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:266)
    at ru.misterparser.acoola.Parser.processFile(Parser.java:121)
    at ru.misterparser.acoola.Parser.run(Parser.java:102)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.poi.xssf.usermodel.XSSFFactory.createDocumentPart(XSSFFactory.java:56)
    at org.apache.poi.POIXMLFactory.createDocumentPart(POIXMLFactory.java:60)
    ... 7 more
Caused by: org.apache.xmlbeans.XmlException: Attribute "relid" bound to namespace "urn:schemas-microsoft-com:office:office" was already specified for element "v:fill".
    at org.apache.poi.xssf.usermodel.XSSFVMLDrawing.read(XSSFVMLDrawing.java:134)
    at org.apache.poi.xssf.usermodel.XSSFVMLDrawing.<init>(XSSFVMLDrawing.java:120)
    ... 13 more
Caused by: org.xml.sax.SAXParseException; lineNumber: 14; columnNumber: 35; Attribute "relid" bound to namespace "urn:schemas-microsoft-com:office:office" was already specified for element "v:fill".
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:203)
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:177)
    at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:400)
    at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:327)
    at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:284)
    at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(XMLNSDocumentScannerImpl.java:322)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2784)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:602)
    at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:112)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:505)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:841)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:770)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:243)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:339)
    at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:121)
    at org.apache.poi.util.DocumentHelper.readDocument(DocumentHelper.java:140)
    at org.apache.poi.xssf.usermodel.XSSFVMLDrawing.read(XSSFVMLDrawing.java:132)
    ... 14 more
Comment on attachment 35961 [details]
test file for crash apache poi open

What produced that test xlsx file? The content of xl/drawings/vmlDrawing1.vml is causing an XML parsing error in Apache Xerces:

<v:fill o:relid="rId1" o:relid="rId1" o:title="Comment_20106100028___Kaminer___темно-голубой" color2="#ffffe1" type="frame"/>

Note the duplicate o:relid="rId1" attributes. There may be a way to set system properties that make Xerces ignore the duplicate attributes.
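The rejection of the duplicate attribute can be reproduced outside POI with a few lines of plain JAXP. The following is only a minimal sketch: the class name DuplicateAttributeCheck, the method parseError, and the wrapper element are made up for illustration, and the attribute values are shortened from the real file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of POI: feeds a snippet to the default JAXP parser.
class DuplicateAttributeCheck {

    /** Returns null if the XML parses, otherwise the parser's error message. */
    static String parseError(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors and ignores warnings and recoverable errors.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return null;
        } catch (SAXParseException e) {
            return e.getMessage();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Wrapper element is an assumption; it only declares the two namespace prefixes.
        String open = "<xml xmlns:v=\"urn:schemas-microsoft-com:vml\""
                + " xmlns:o=\"urn:schemas-microsoft-com:office:office\">";
        String bad = open + "<v:fill o:relid=\"rId1\" o:relid=\"rId1\" type=\"frame\"/></xml>";
        System.out.println(parseError(bad));
    }
}
```

Running it against a snippet with a single o:relid attribute should return null, which makes it a quick way to check whether a given parser setup accepts the file.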
This file was sent by a supplier of children's clothing, https://acoolakids.ru/. I'll find out from them how they create such a file. What kind of workaround can I use? For now, I just ask users of my software to re-save the file in Excel, which fixes the error.
The error is coming from the Java built-in version of the Apache Xerces XML parser. You can replace it (cf. https://docs.oracle.com/javase/7/docs/api/javax/xml/parsers/DocumentBuilderFactory.html) and experiment with different XML parsers and their settings. I am not personally aware of any setup that works around this issue. The XML in your spreadsheet is not well-formed.
Could you please suggest which Java XML parsers are currently maintained and working, other than Apache Xerces-J?
As far as I can see, this is not a problem in POI itself, but is caused by invalid XML content in the document that you are trying to parse.

You have a few options:

* get the supplier to fix the file so that it conforms to the standard
* pre-process the file so that the invalid XML is fixed before you pass it to POI
* use a different XML parser (this is probably not easy, as it might cause other issues, and I am not sure other implementations will parse this; any fully compliant XML parser should reject this XML with an error)

As far as I can see, there is not much we can do here on our side.
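The pre-processing option could look roughly like the sketch below. It copies the xlsx (a ZIP archive) entry by entry and, in entries whose name ends in .vml, drops any attribute that repeats an earlier attribute name in the same tag. The class and method names are hypothetical, the VML parts are assumed to be UTF-8, and the regex-based tag matching is a simplification that would also touch tag-like text inside comments or CDATA sections.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical pre-processor, not part of POI.
final class VmlAttributeDeduplicator {

    // A start tag or empty-element tag with at least one quoted attribute.
    private static final Pattern START_TAG = Pattern.compile(
            "<([A-Za-z_][\\w:.-]*)((?:\\s+[\\w:.-]+\\s*=\\s*(?:\"[^\"]*\"|'[^']*'))+)(\\s*/?>)");
    // One attribute, including the whitespace in front of it.
    private static final Pattern ATTRIBUTE = Pattern.compile(
            "\\s+([\\w:.-]+)\\s*=\\s*(?:\"[^\"]*\"|'[^']*')");

    /** Keeps the first occurrence of each attribute name within a tag. */
    static String removeDuplicateAttributes(String xml) {
        Matcher tag = START_TAG.matcher(xml);
        StringBuffer out = new StringBuffer();
        while (tag.find()) {
            Set<String> seen = new HashSet<>();
            StringBuilder attrs = new StringBuilder();
            Matcher attr = ATTRIBUTE.matcher(tag.group(2));
            while (attr.find()) {
                if (seen.add(attr.group(1))) {
                    attrs.append(attr.group());
                }
            }
            String rebuilt = "<" + tag.group(1) + attrs + tag.group(3);
            tag.appendReplacement(out, Matcher.quoteReplacement(rebuilt));
        }
        tag.appendTail(out);
        return out.toString();
    }

    /** Copies the archive, cleaning *.vml entries; closes both streams when done. */
    static void rewrite(InputStream xlsxIn, OutputStream xlsxOut) throws IOException {
        try (ZipInputStream zin = new ZipInputStream(xlsxIn);
             ZipOutputStream zout = new ZipOutputStream(xlsxOut)) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                byte[] data = readAll(zin);
                if (entry.getName().endsWith(".vml")) {
                    // Assumption: the VML part is encoded as UTF-8.
                    String xml = new String(data, StandardCharsets.UTF_8);
                    data = removeDuplicateAttributes(xml).getBytes(StandardCharsets.UTF_8);
                }
                zout.putNextEntry(new ZipEntry(entry.getName()));
                zout.write(data);
                zout.closeEntry();
            }
        }
    }

    private static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buf.write(chunk, 0, n);
        }
        return buf.toByteArray();
    }
}
```

One way to use it would be to call rewrite with a FileInputStream and a ByteArrayOutputStream, then hand a ByteArrayInputStream over the result to WorkbookFactory.create. Since duplicates are dropped by name only, a file whose repeated attributes carry different values would silently keep the first value.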