Bug 63029

Summary: OPCPackage Potentially clobbers files on close()
Product: POI Reporter: Axel Dörfler <a.doerfler>
Component: OPCAssignee: POI Developers List <dev>
Status: RESOLVED FIXED    
Severity: major    
Priority: P2    
Version: 4.0.x-dev   
Target Milestone: ---   
Hardware: All   
OS: All   
Bug Depends on: 59287    
Bug Blocks:    

Description Axel Dörfler 2018-12-21 10:50:06 UTC
I read an XLSX file this way:
                workBookPackage = OPCPackage.open(path.toFile(), PackageAccess.READ);
                workBook = new XSSFWorkbook(workBookPackage);

I'm using this method, as the documentation says that "Creating a XSSFWorkbook from a file-backed OPC Package has a lower memoryfootprint than an InputStream backed one."

Now my application got interrupted, and this resulted in a zero byte XLSX file. Unfortunately, I don't have the complete stack trace anymore, but I got a `java.nio.channels.ClosedByInterruptException` from FileHelper.copyFile().

Looking at ZipPackage.closeImpl(), it looks like it *always*, and unconditionally clobbers the original file, even if I had used PackageAccess.READ to open the package.

The other issue is, that closeImpl() does not even try to use an atomic move to make replacing the original file saver.
Comment 1 Axel Dörfler 2018-12-21 11:02:07 UTC
I accidentally copied the wrong sources; when the problem appeared, I did not specify PackageAccess.READ in OPCPackage.open().
Comment 2 PJ Fanning 2018-12-21 12:13:58 UTC
can you provide a reproducible test case?
Comment 3 Axel Dörfler 2018-12-21 12:37:43 UTC
Sorry, if I had the time for that, I would have delivered a patch already ;-)

But I've read into the sources a bit more; turns out that only OPCPackage.close() calls into ZipPackage.closeImpl(), and that is *not* called in case of PackageAccess.READ. So that part is a wrong assumption of mine.

I should have called OPCPackage.revert() instead of close() -- not the most intuitive API choice, but my fault in the end.

The only remaining issue is that the file is lost if ZipPackage.closeImpl() is aborted at the wrong time. It really should, if possible, use an atomic move to replace the original file instead.
Comment 4 Axel Dörfler 2018-12-21 12:46:54 UTC
What makes this bug even worse is that the temp file is deleted in any case; ZipPackage.closeImpl() deletes the temp file in a finally block.

Even if an atomic move is not possible (if the temp dir is on a different file system, for example), you can still make it more secure by renaming the original file first, and only delete that in case everything worked out.
Comment 5 Axel Dörfler 2018-12-21 12:56:53 UTC
Only my misuse of the API depends on #59287 -- the actual issue does not, however.
Even when using the API correctly, the file may end up being corrupted at the end.
Comment 6 Yegor Kozlov 2019-02-09 11:54:42 UTC
a unit test to reproduce corruption :

try(OPCPackage pkg = OPCPackage.open(path.toFile(), PackageAccess.READ_WRITE)) {
    // add a marshaller that will throw an exception on save
    pkg.addMarshaller("poi/junit", (part, out) -> {
        throw new RuntimeException("Bugzilla 63029");
    });
    pkg.createPart(PackagingURIHelper.createPartName("/poi/test.xml"), "poi/junit");
} catch (RuntimeException e){
  assert("Bugzilla 63029", e.getMessage());
}

// try to read the source file once again
try ( ZipFile zip = new ZipFile(path.toFile())){
 // throws java.util.zip.ZipException: archive is not a ZIP archive
}
 

an exception while saving *may* result in a clobbered file. The size of the corrupted data depends on how much was saved and flushed on disk: it can be anywhere from zero-byte to the length of the original file.

a simple change to avoid corruption would be to replace the original file only if  save() succeeded. Something like this:

boolean success = false;
try {
  save(tempFile);
  success = true;
} finally {
    // Close the current zip file, so we can overwrite it on all platforms
    IOUtils.closeQuietly(this.zipArchive);
    try {
        // Copy the new file over the old one
        if(success) {
            FileHelper.copyFile(tempFile, targetFile);
        }
    } finally {
        // Either the save operation succeed or not, we delete the temporary file
        if (!tempFile.delete()) {
            LOG.log(POILogger.WARN, "The temporary file: '"
                    + targetFile.getAbsolutePath()
                    + "' cannot be deleted ! Make sure that no other application use it.");
        }
    }
}

this would leave the origin intact in case of an exception.
Comment 7 Yegor Kozlov 2019-02-12 16:56:12 UTC
Fixed in r1853454, a unit test added.

Yegor