If the created date ends with an offset that contains a colon (e.g. "2016-08-17T07:13:06+00:00"), POI fails to read the file. Please refer to this StackOverflow post for more details, as the error described there is the same one I've found: http://stackoverflow.com/questions/36522278/exceptionorg-apache-poi-openxml4j-exceptions-invalidformatexception-date-not-w
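For illustration only, here is a minimal sketch showing that a timestamp with a colon in its offset is well-formed ISO 8601 and normalizes to the same instant as the "Z" form. It uses the JDK's java.time classes, not POI's own date parsing code; the class name OffsetWithColonDemo and the helper toUtc are made up for this example.

```java
import java.time.Instant;
import java.time.OffsetDateTime;

public class OffsetWithColonDemo {
    // Hypothetical helper: parses an ISO 8601 timestamp whose offset is
    // written as +hh:mm (or Z) and normalizes it to a UTC instant.
    static Instant toUtc(String created) {
        return OffsetDateTime.parse(created).toInstant();
    }

    public static void main(String[] args) {
        // The value from the report, with a colon inside the offset.
        System.out.println(toUtc("2016-08-17T07:13:06+00:00")); // prints 2016-08-17T07:13:06Z
    }
}
```

A reader that handles this form would be expected to report the created date as 2016-08-17T07:13:06Z.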
Can you check with a more recent version of POI? I think we fixed this in https://bz.apache.org/bugzilla/show_bug.cgi?id=59183. If we didn't, would you be able to supply a triggering file?
Created attachment 34188: core.xml with the timestamp containing a colon
Unfortunately the file contains sensitive data and I cannot upload it. However, I've attached the core.xml extracted from the problematic file; if you use a zip utility to replace the core.xml of a valid XLSX with this one, the issue can be reproduced. I've tested with version 3.13 (as suggested in the related bug you posted) and it works. I've also tested with version 3.15-beta2 (the latest one I've seen in the Maven repo) and it still fails. Hope this helps.
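If a zip utility is not at hand, the swap described above could be scripted roughly as follows. This is a sketch under assumptions: it uses the JDK's zip file system, assumes the core properties part sits at the conventional docProps/core.xml path, and the class name SwapCoreXml is made up. It modifies the file in place, so run it on a copy of a valid XLSX.

```java
import java.io.IOException;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class SwapCoreXml {
    // Assumption: conventional location of the core properties part.
    static final String CORE_PART = "docProps/core.xml";

    // Overwrites the core properties part of an OOXML package in place.
    static void swapCoreXml(Path ooxmlFile, Path replacementCoreXml) throws IOException {
        // Open the package as a zip file system; closing it writes the change back.
        try (FileSystem zip = FileSystems.newFileSystem(ooxmlFile, (ClassLoader) null)) {
            Path target = zip.getPath(CORE_PART);
            Files.copy(replacementCoreXml, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Arguments: <copy of a valid xlsx> <core.xml from the attachment>
        swapCoreXml(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```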
Yes. Thank you. I'll take a look in the next few days.
I replaced the core.xml in a docx file with your attachment, and all works with Tika trunk, which means POI-3.15-beta1. Let me know if you are still having problems with poi >= POI-3.15-beta1.

Application-Name: Microsoft Office Word
Application-Version: 12.0000
Content-Length: 8382
Content-Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
Creation-Date: 2016-08-17T07:13:06Z
Last-Author: krajee
Last-Modified: 2016-08-17T07:13:06Z
Last-Save-Date: 2016-08-17T07:13:06Z
Manager: MyManager
Page-Count: 1
Template: Normal.dotm
Total-Time: 1
X-Parsed-By: org.apache.tika.parser.DefaultParser
X-Parsed-By: org.apache.tika.parser.microsoft.ooxml.OOXMLParser
X-TIKA:digest:MD5: 4f7dd54a6bf8651c7b77e33e010924ce
X-TIKA:digest:SHA256: a70c09b5b81d0da0feb86ffdc4af115665b00d5c46ea5078243b021d0f850b98
date: 2016-08-17T07:13:06Z
dc:description: Grid export generated by Krajee ExportMenu widget (yii2-export)
dc:publisher: Wygwam
dcterms:created: 2016-08-17T07:13:06Z
dcterms:modified: 2016-08-17T07:13:06Z
description: Grid export generated by Krajee ExportMenu widget (yii2-export)
extended-properties:AppVersion: 12.0000
extended-properties:Application: Microsoft Office Word
extended-properties:Company: Wygwam
extended-properties:Manager: MyManager
extended-properties:Template: Normal.dotm
extended-properties:TotalTime: 1
meta:creation-date: 2016-08-17T07:13:06Z
meta:last-author: krajee
meta:page-count: 1
meta:save-date: 2016-08-17T07:13:06Z
modified: 2016-08-17T07:13:06Z
publisher: Wygwam
resourceName: tmp.docx
xmpTPg:NPages: 1
Sorry, didn't read your followup closely enough. I upgraded a local copy of Tika to poi-3.15-beta2 and tested with an xlsx file. I'm not able to reproduce the problem:

<meta name="date" content="2016-08-17T07:13:06Z" />
<meta name="X-Parsed-By" content="org.apache.tika.parser.DefaultParser" />
<meta name="X-Parsed-By" content="org.apache.tika.parser.microsoft.ooxml.OOXMLParser" />
<meta name="dc:description" content="Grid export generated by Krajee ExportMenu widget (yii2-export)" />
<meta name="extended-properties:AppVersion" content="16.0300" />
<meta name="meta:creation-date" content="2016-08-17T07:13:06Z" />
<meta name="extended-properties:Application" content="Microsoft Excel" />
<meta name="meta:last-author" content="krajee" />
<meta name="extended-properties:Company" content="" />
<meta name="Creation-Date" content="2016-08-17T07:13:06Z" />
<meta name="description" content="Grid export generated by Krajee ExportMenu widget (yii2-export)" />
<meta name="dcterms:created" content="2016-08-17T07:13:06Z" />
<meta name="Last-Author" content="krajee" />
<meta name="Last-Modified" content="2016-08-17T07:13:06Z" />
<meta name="dcterms:modified" content="2016-08-17T07:13:06Z" />
<meta name="Last-Save-Date" content="2016-08-17T07:13:06Z" />
<meta name="Application-Version" content="16.0300" />
<meta name="protected" content="false" />
<meta name="meta:save-date" content="2016-08-17T07:13:06Z" />
<meta name="Application-Name" content="Microsoft Excel" />
<meta name="modified" content="2016-08-17T07:13:06Z" />
<meta name="publisher" content="" />
<meta name="Content-Type" content="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet" />
<meta name="dc:publisher" content="" />
<title></title>
</head>
<body><div><h1>Sheet1</h1>
<table><tbody /></table>
</div>
</body></html>
Created attachment 34225: file with core.xml swapped out with example posted in earlier attachment
I'm attaching the test file that I used.
I have re-tested the original file with version 3.15-beta2, this time launching the tests from the command line instead of from the IntelliJ IDE, which seems not to have picked up the change of library version. With version 3.15-beta2 it works as expected. We will wait until the 3.15 release is out to upgrade. Thanks for the help, and sorry for the inconvenience.
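For anyone who runs into the same stale-classpath situation, one way to check which jar a class is really loaded from is sketched below. It relies only on standard JDK calls; the class name WhichJar and the helper locationOf are made up, and the default class name passed to Class.forName assumes the POI OOXML jar is on the classpath.

```java
import java.security.CodeSource;

public class WhichJar {
    // Hypothetical helper: returns the jar or directory a class was loaded
    // from, or a placeholder for classes without a code source (JDK classes).
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(no code source)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Assumption: any class shipped in the POI OOXML jar would do here.
        String name = args.length > 0 ? args[0] : "org.apache.poi.openxml4j.opc.OPCPackage";
        System.out.println(locationOf(Class.forName(name)));
    }
}
```

Running this once from the IDE and once from the command line should show whether both resolve to the same POI jar version.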
> Thanks for the help, and sorry for the inconvenience.
No problem at all. Thank you for opening this issue. Please let us know what else you find!