Bug 65452 - NotOLE2FileException not thrown in POI 5.0.0 by opening an XML-RAW File with WorkbookFactory.create()
Status: NEW
Alias: None
Product: POI
Classification: Unclassified
Component: POIFS
Version: 5.0.0-FINAL
Hardware: PC All
Importance: P2 normal
Target Milestone: ---
Assignee: POI Developers List
Reported: 2021-07-15 14:06 UTC by johannes.summerer
Modified: 2021-07-15 14:13 UTC

Attachments
An Example File for this case (2.42 KB, application/vnd.ms-excel)
2021-07-15 14:06 UTC, johannes.summerer

Description johannes.summerer 2021-07-15 14:06:34 UTC
Created attachment 37957
An Example File for this case

Hi everybody,

We use the POI library to consolidate Excel files from different sources. Among other file types, we also receive raw XML files with the extension .xls as a special case.

We handled these files by catching NotOLE2FileException in a try-catch block. In the catch block we passed the file on to Tika for further processing. Up to version 4.1.2, this worked very well.

Unfortunately, as of POI 5.0.0, WorkbookFactory no longer throws this exception in that error case. Instead, it returns null rather than a workbook.
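
One possible workaround while the behaviour is unclear is to inspect the file header before calling WorkbookFactory.create(), and to treat a null result as a failure. The following is a minimal sketch only: HeaderSniffer and startsWithXmlDeclaration are hypothetical names that are not part of POI, and the sketch assumes a raw XML spreadsheet starts with the ASCII bytes "<?xml" (a file with a leading byte order mark or whitespace would not match).

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helper, not part of POI: checks for a leading XML declaration. */
final class HeaderSniffer {
    // Assumption: a raw XML spreadsheet begins with the ASCII bytes "<?xml".
    private static final byte[] XML_MAGIC = "<?xml".getBytes(StandardCharsets.US_ASCII);

    private HeaderSniffer() {
    }

    /** Returns true if the stream begins with "<?xml"; consumes up to five bytes. */
    static boolean startsWithXmlDeclaration(InputStream in) throws IOException {
        byte[] head = new byte[XML_MAGIC.length];
        // readNBytes (Java 9+) keeps reading until the buffer is full or the stream ends.
        int read = in.readNBytes(head, 0, head.length);
        return read == XML_MAGIC.length && Arrays.equals(head, XML_MAGIC);
    }
}
```

The helper could be called on a stream opened from the file (for example in a try-with-resources block) before the file is handed to WorkbookFactory, so that raw XML files go straight to the Tika fallback regardless of which exception, if any, POI throws.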

I was able to see a change in the FileMagic class, which seems to be used for this.

The constants OOXML_FILE_HEADER and RAW_XML_FILE_HEADER from POIFSConstants no longer exist.
Instead, the values are passed directly to the enum constants in the FileMagic class, but the type of the value is no longer a byte array.
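
To check whether a change of type alone could alter header matching, the following sketch compares a byte-array header with the same header written as a string. It is illustrative only: MagicComparison and its members are hypothetical names, the values are not copied from the POI sources, and it assumes the raw XML header is the five characters "<?xml" converted one byte per character.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical comparison, not POI code: byte-array header versus string header. */
final class MagicComparison {
    // Header written as an explicit byte array.
    static final byte[] XML_AS_BYTES = {0x3c, 0x3f, 0x78, 0x6d, 0x6c};
    // The same header written as a string and converted one byte per character.
    static final byte[] XML_AS_STRING = "<?xml".getBytes(StandardCharsets.ISO_8859_1);

    private MagicComparison() {
    }

    /** True if both definitions yield identical bytes. */
    static boolean sameMagic() {
        return Arrays.equals(XML_AS_BYTES, XML_AS_STRING);
    }

    /** True if data starts with the given magic bytes. */
    static boolean matches(byte[] magic, byte[] data) {
        if (data.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (data[i] != magic[i]) {
                return false;
            }
        }
        return true;
    }
}
```

Under these assumptions both definitions produce the same bytes, so a difference in behaviour would have to come from how the detected type is handled afterwards rather than from the header value itself. This does not rule out other changes in 5.0.0.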

This may be what is causing the error, but I'm not sure.

In addition, the enum constants BIFF2 and BIFF3 have changed in how they explicitly declare the type byte[].

Thanks in advance for any help!

Code snippet:

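import java.io.File;
import org.apache.poi.poifs.filesystem.NotOLE2FileException;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;

// Excerpt from inside a method that returns the parsed result
Workbook myWorkBook;
File xls = new File("Example.XLS");

try {
    myWorkBook = WorkbookFactory.create(xls);
} catch (NotOLE2FileException ex) {
    // Raw XML or an unknown header: hand the file to Tika instead
    String message = ex.getMessage();
    if (message.contains("The supplied data appears to be a raw XML file")
            || message.contains("Invalid header signature")) {
        return MyTika.parseHTMLandXMLTable(xls);
    }
    throw ex;
}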