Bug 57296 - ZipInputStreamZipEntrySource.java:61 should catch IOExceptions on "close" after successfully reading
Summary: ZipInputStreamZipEntrySource.java:61 should catch IOExceptions on "close" aft...
Status: NEEDINFO
Alias: None
Product: POI
Classification: Unclassified
Component: OPC (show other bugs)
Version: 3.11-dev
Hardware: PC Linux
: P2 normal (vote)
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-12-02 15:12 UTC by Torsten Krah
Modified: 2017-02-17 11:52 UTC (History)
0 users



Attachments
JUnit Testcase (4.43 KB, application/x-gzip)
2014-12-08 10:32 UTC, Torsten Krah
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Torsten Krah 2014-12-02 15:12:54 UTC
Beginning with 1.7.0_71 closing a CipherInputStream may result in a BadPaddingException on closing the stream, signaling there maybe still data left to read.

Got this in my trace:

Caused by: java.io.IOException: javax.crypto.BadPaddingException: Given final block not properly padded
	at javax.crypto.CipherInputStream.close(CipherInputStream.java:321) ~[na:1.8.0_25]
	at java.io.BufferedInputStream.close(BufferedInputStream.java:483) ~[na:1.8.0_25]
	at java.io.PushbackInputStream.close(PushbackInputStream.java:379) ~[na:1.8.0_25]
	at java.util.zip.InflaterInputStream.close(InflaterInputStream.java:227) ~[na:1.8.0_25]
	at java.util.zip.ZipInputStream.close(ZipInputStream.java:266) ~[na:1.8.0_25]
	at org.apache.poi.openxml4j.util.ZipInputStreamZipEntrySource.<init>(ZipInputStreamZipEntrySource.java:61) ~[poi-ooxml-3.10-FINAL-20140208.jar:3.10-FINAL]

This was changed with 71 and was silently ignored before, look here:

https://bugs.openjdk.java.net/browse/JDK-8061

Looking at the code it seems to me, that on the point where the stream is closed" all is "done" and any IOException on "closing" the stream can be ignored because there is no ZipEntry left in the ZipInputStreamZipEntrySource, correct?
Can you please add some fix here, so that close is something like:

try {
 inp.close();
} catch(IOException e) {
 // ignore or log it
}
Comment 1 Torsten Krah 2014-12-02 15:14:10 UTC
Upstream Link is incomplete:

https://bugs.openjdk.java.net/browse/JDK-8061619
Comment 2 Andreas Beeker 2014-12-05 21:59:19 UTC
I've update to Oracle JDK 7u72 and the decompiled CipherInputStream seems to be similar to the openjdk version. The ooxml based junit tests (TestDecryptor,TestAgileEncryptionParameters) passed, even when I've added a close() to TestDecryptor.zipOk().

Please attach your test file, so I can reproduce it.
Comment 3 Torsten Krah 2014-12-08 09:53:48 UTC
Got a few of those documents which are showing this exception - however i am unable / not allowed to share those customer documents.
All common to them is that its docx files - still searching a way to create such a failing file which i am able to share.
Comment 4 Torsten Krah 2014-12-08 10:32:47 UTC
Created attachment 32267 [details]
JUnit Testcase

Includes Java-Testcode, the failing document and the JCEKS keystore used to encrypt/decrypt
Comment 5 Torsten Krah 2014-12-08 10:41:40 UTC
AddOn: JCE-Extensions needs to be installed, its a 256 Bit AES-Key.
Comment 6 Andreas Beeker 2014-12-08 16:24:07 UTC
At your first post, I thought there's something wrong with the agile decryption routines, which use a "NoPadding" specifier - so BadPaddingException shouldn't be thrown on close() ... but who knows ...

Looking at your test/use case now, it seems that you aren't handling password protected files (as in "a password prompt opens in Word"), but unwrap them yourself with a private key.

In this case you create the CipherInputStream and need to make sure that the Extractor don't need to handle cipher-based exceptions on close(), e.g. by using a custom FilterInputStream

Feel free to reopen the bug with a password protected file and I'll have look into it.
Comment 7 Torsten Krah 2014-12-09 09:39:14 UTC
The ticket is about a more robust usage of the close method and the exception handling of the provided InputStream - it has nothing to do in the first place about some password protected files or such stuff (didn't write anything about that).

Its afaik good practice that libraries, such as poi, should not close InputStreams they did not create but are provided with.
Its the responsibility of the creator to close it.

In this case if poi would not close those stream i could work with the Extractor:

################
InputStream in = ...

Use poi stuff

close the stream and handle exception if needed
################

I would know exception would be at "close" time and not at reading the stream.

Currently POI closes the stram and i don't know that it was only at close time at runtime - i am only getting an IOException from POI - i don't know why without doing stack rewinding to look if it was from a read or close call.
The close call could be ignored in my case - the read of cause can't be ignore because it would be an invalid TextExtractor instance.

But POI does close the stream after it does successfully read all data from the stream.

The Interface from close from the InputStream does specify an IOException may be thrown - so any Stream provided may throw such an exception (the BadBadding one was only an example where this does happen now with the JDK upstream change) - that's why the subject of the ticket is called:

"should catch IOExceptions on "close" after successfully reading"

So the question in this ticket is, why POI is not more robust here and handle those exception on close - because what was needed was read as far as i understand the Code i've looked at.

It doesn't make sense to wrap the close in another stream (which i am currently doing as a workaround until this is hopefully addressed) and ignore it there because in this case you would need to tell all users if they ever use this method from the Extractor with an InputStream, that they must provide an InputStream which must not throw an exception when closed, because they won't be able to use POI at all if they can't assure that - which not all users can make sure without wrapping it, because they may be also provided with an InputStream from which they don't know if it may throw an Exception on close - in doubt it may do so, because the Interface does allow it.

So my proposal is:

1. Don't close the stream you did not create, just read from it until you are finished and let the creator of the stream handle the close.

2. If you really want to close it be more robust and handle the exception - it can be ignored imho so you can at least catch and ignore or log it at this place.
Comment 8 Andreas Beeker 2014-12-12 00:05:05 UTC
I don't like the idea of catching the exception and using the log to bury it.

The question is, when we remove the close from the classes, how much is user code affected by or relying on that feature, i.e. when is the stream finalized.
I normally close a stream, even it has been closed inside the poi classes - I think that's common habit.

We could make that handling configurable on a system or static property.

Btw. in my $dayjob, we also had a similar problem, I think it was related to a database stream, so it might be common, that the closing is seen as annoying ...

What do the other think?
Comment 9 Torsten Krah 2014-12-12 10:09:27 UTC
Yeah i make sure streams getting closed too, no matter if a library claims to do it or not - just to be sure not to leak file descriptors, but don't know if everyone does it, hope so.
So yes i know this may be a breaking change for users relying on it, don't know whats the best way to fix this - change and document it perhaps?
Comment 10 Javen O'Neal 2016-10-09 11:40:47 UTC
Potentially fixed by bug 60128 in POI 3.15.
Comment 11 Javen O'Neal 2017-02-17 05:38:27 UTC
(In reply to Javen O'Neal from comment #10)
> Potentially fixed by bug 60128 in POI 3.15.

Torsten, can you verify this (3 years later)?
Comment 12 Torsten Krah 2017-02-17 11:52:26 UTC
POI is still closing the InputStream it did not create instead of just reading from it - so this is still a valid concern if close throws an exception but data needed could be successfully read.
The original case was fixed later in the JDK and at least my root case - the javax.crypto.BadPaddingException on close - is not thrown anymore - but the case itself here to not close the stream it did not create, e.g. like lucene does not close them and mentions that on the api docs:

https://lucene.apache.org/core/4_9_0/analyzers-common/org/apache/lucene/analysis/hunspell/Dictionary.html

Discussable though.