Created attachment 32234 [details] Error Excel file Using Tika to convert the attachment file raised exception below. This file is opened normally by MS Office. =================================================================== Apache Tika was unable to parse the document The full exception stack trace is included below: org.apache.tika.exception.TikaException: TIKA-198: Illegal IOException from org.apache.tika.parser.microsoft.OfficeParser@47098a at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:253) at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:247) at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120) at org.apache.tika.gui.TikaGUI.handleStream(TikaGUI.java:342) at org.apache.tika.gui.TikaGUI.openFile(TikaGUI.java:299) at org.apache.tika.gui.TikaGUI.actionPerformed(TikaGUI.java:256) at javax.swing.AbstractButton.fireActionPerformed(AbstractButton.java:1995) at javax.swing.AbstractButton$Handler.actionPerformed(AbstractButton.java:2318) at javax.swing.DefaultButtonModel.fireActionPerformed(DefaultButtonModel.java:387) at javax.swing.DefaultButtonModel.setPressed(DefaultButtonModel.java:242) at javax.swing.AbstractButton.doClick(AbstractButton.java:357) at javax.swing.plaf.basic.BasicMenuItemUI.doClick(BasicMenuItemUI.java:809) at javax.swing.plaf.basic.BasicMenuItemUI$Handler.mouseReleased(BasicMenuItemUI.java:850) at java.awt.Component.processMouseEvent(Component.java:6297) at javax.swing.JComponent.processMouseEvent(JComponent.java:3275) at java.awt.Component.processEvent(Component.java:6062) at java.awt.Container.processEvent(Container.java:2039) at java.awt.Component.dispatchEventImpl(Component.java:4660) at java.awt.Container.dispatchEventImpl(Container.java:2097) at java.awt.Component.dispatchEvent(Component.java:4488) at java.awt.LightweightDispatcher.retargetMouseEvent(Container.java:4575) at java.awt.LightweightDispatcher.processMouseEvent(Container.java:4236) at java.awt.LightweightDispatcher.dispatchEvent(Container.java:4166) at java.awt.Container.dispatchEventImpl(Container.java:2083) at java.awt.Window.dispatchEventImpl(Window.java:2489) at java.awt.Component.dispatchEvent(Component.java:4488) at java.awt.EventQueue.dispatchEventImpl(EventQueue.java:674) at java.awt.EventQueue.access$400(EventQueue.java:81) at java.awt.EventQueue$2.run(EventQueue.java:633) at java.awt.EventQueue$2.run(EventQueue.java:631) at java.security.AccessController.doPrivileged(Native Method) at java.security.AccessControlContext$1.doIntersectionPrivilege(AccessControlContext.java:87) at java.security.AccessControlContext$1.doIntersectionPrivilege(AccessControlContext.java:98) at java.awt.EventQueue$3.run(EventQueue.java:647) at java.awt.EventQueue$3.run(EventQueue.java:645) at java.security.AccessController.doPrivileged(Native Method) at java.security.AccessControlContext$1.doIntersectionPrivilege(AccessControlContext.java:87) at java.awt.EventQueue.dispatchEvent(EventQueue.java:644) at java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:269) at java.awt.EventDispatchThread.pumpEventsForFilter(EventDispatchThread.java:184) at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:174) at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:169) at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:161) at java.awt.EventDispatchThread.run(EventDispatchThread.java:122) Caused by: java.io.IOException: Invalid header signature; read 0x0010000000060409, expected 0xE11AB1A1E011CFD0 - Your file appears not to be a valid OLE2 document at org.apache.poi.poifs.storage.HeaderBlock.<init>(HeaderBlock.java:140) at org.apache.poi.poifs.storage.HeaderBlock.<init>(HeaderBlock.java:115) at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.<init>(NPOIFSFileSystem.java:270) at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:166) at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:247) ... 43 more ===================================================================
See TIKA-1487, this is an Excel 4 file which isn't ole2 based
Great, thanks !