Bug 65854 - OPCPackage should revert() instead of close() on failed open in some cases
Summary: OPCPackage should revert() instead of close() on failed open in some cases
Alias: None
Product: POI
Classification: Unclassified
Component: OPC (show other bugs)
Version: 5.2.0-FINAL
Hardware: PC Linux
: P2 trivial (vote)
Target Milestone: ---
Assignee: POI Developers List
Depends on:
Reported: 2022-01-27 18:03 UTC by Tim Allison
Modified: 2022-03-20 07:47 UTC (History)
0 users


Note You need to log in before you can comment on or make changes to this bug.
Description Tim Allison 2022-01-27 18:03:08 UTC
Thanks to PJ and Andi and the POI team, we've integrated 5.2.0 into Apache Tika, and we're about to make a release of 2.3.0.

I noticed a bunch of logged warnings: "The close() method is intended to SAVE a package. This package is open in READ ONLY mode, use the revert() method instead!"

We've wrapped our use of OPCPackage so that we're calling revert() everywhere.  

However, in our zip detector, when trying to figure out if a zip is an OPCPackage, we call OPCPackage.open(ZipEntrySource zipEntry), and if there's an exception, we swallow it and figure that zip file is not an OPCPackage.

The problem is that at the POI level, IOUtils.closeQuietly() is called on the pack after it is created in the catch block. 

Is there a cleaner way with POI to determine if a zip file is an OPCPackage?

Should we be calling revert() instead at the POI level to avoid this warning?


    public static OPCPackage open(ZipEntrySource zipEntry) throws InvalidFormatException {
        ZipPackage pack = new ZipPackage(zipEntry, PackageAccess.READ);

        try {
            if (pack.partList == null) {

            return pack;
        } catch (RuntimeException | InvalidFormatException var3) {
            throw var3;
Comment 1 PJ Fanning 2022-01-28 10:43:18 UTC
I changed the logging in r1897562. I don't think this warrants a warning log.
Comment 2 Dominik Stadler 2022-01-30 17:22:06 UTC
Not sure if using "org.apache.poi.poifs.filesystem.FileMagic" would help you to determine the file-type before trying to do actual parsing? It looks at the first few bytes to quickly see if it looks like a zip-file, binary-OLE2 or some other older formats.

In general I would use "revert()" at these points when opening the file fails and we are opening it read-only. I have the relevant changes locally, can add them to this bug later.
Comment 3 Dominik Stadler 2022-03-20 07:47:03 UTC
Applied some changes via r1899071 which uses revert() instead of close() whenever the package was opened read-only.