Bug 56537 - Using org.apache.poi.ss.usermodel.WorkbookFactory.create(File) leaks file handles
Summary: Using org.apache.poi.ss.usermodel.WorkbookFactory.create(File) leaks file han...
Status: RESOLVED FIXED
Alias: None
Product: POI
Classification: Unclassified
Component: POI Overall (show other bugs)
Version: 3.10-FINAL
Hardware: PC All
: P2 normal (vote)
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
: 56609 56625 (view as bug list)
Depends on:
Blocks:
 
Reported: 2014-05-16 19:13 UTC by Daniel Atallah
Modified: 2014-07-03 05:47 UTC (History)
3 users (show)



Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Daniel Atallah 2014-05-16 19:13:53 UTC
According to the API docs for XSSFWorkbook ( http://poi.apache.org/apidocs/org/apache/poi/xssf/usermodel/XSSFWorkbook.html#XSSFWorkbook%28java.lang.String%29 ), "OPCPackage.close()" needs to be called when interaction with the Workbook is complete.

It's not possible to do that when using org.apache.poi.ss.usermodel.WorkbookFactory.create(File), so it presumably leaks in the same way that the deprecated XSSFWorkbook(String) constructor does.

It looks like same issue exists in the HSSFWorkbook() implementation for that method too - leaking the file handle in NPOIFSFileSystem.
Comment 1 Nick Burch 2014-06-04 15:46:20 UTC
WorkbookFactory.create(File) is only about 5 lines of code. If your use case means you need to close the resources explicitly, your best bet is to pull out the key 4 lines, and use them directly, along with adding hooks into your application's own resource tidy-up system.

Given the current simple method signature, it's not possible to return both the Workbook and the Closeable resource
Comment 2 Daniel Atallah 2014-06-04 16:27:24 UTC
Yes, I agree there's a relatively simple workaround (and that's exactly what we've done).  My concern is mainly that it isn't obvious that you'll run into a file handle leak if you use that functionality.  It seems to me that the documentation should be updated and the method should be deprecated since there doesn't appear to be a clean way to fix it.
Comment 3 Nick Burch 2014-06-04 16:36:15 UTC
I don't think we want to deprecate the method without a replacement - for most people it's by far the best way, as it's much lower memory

In r1600326 I've added a note to the javadoc
Comment 4 Daniel Atallah 2014-06-04 17:08:28 UTC
No disagreement at all that it's preferable to use the lower memory functionality, and I'm very grateful to have that capability. :)

My concern continues to be that it's still not obvious that this isn't an appropriate API method to use in e.g. a long running application.  You just find out when you run out of available file handles :)

If the method were deprecated and the documentation had more details about why and how to avoid the problem, it would (should?) be more obvious that there's a "caveat emptor" situation.
Comment 5 Nick Burch 2014-06-04 21:11:20 UTC
I believe that the file handles should get returned when the references get garbage collected, so it shouldn't be the end of the world unless you're handling large numbers of small files

If we can come up with a simple API for WorkbookFactory that allows for opening with a File, and allows for explicit closing, then I'm happy to deprecate the existing one. Suggestions needed though!
Comment 6 Daniel Atallah 2014-06-09 22:01:17 UTC
I'm not sure what kinds of changes would be acceptable to make at this point, but I guess that if Workbook was made Closeable, that could be one way to resolve this.

The Workbook could take ownership of the OPCPackage (and whatever is the corresponding thing for other constructors) and when it's closed, it can clean up.

I suppose that you're right - GC will likely clean up the file handles since ZipFile calls close() in its finalize() method, but it's generally not a good practice to count on GC for such things.
Comment 7 Nick Burch 2014-06-11 13:49:07 UTC
I've had a go at implementing this in r1601901.

Any chance you could give that a whirl, and report back if that does what you were expecting?

If so, we can update the javadocs for WorkbookFactory and XSSFWorkbook.open(String), along with knocking up a quick unit test
Comment 8 Daniel Atallah 2014-06-12 19:06:43 UTC
(In reply to Nick Burch from comment #7)
> I've had a go at implementing this in r1601901.
> 
> Any chance you could give that a whirl, and report back if that does what
> you were expecting?
> 
> If so, we can update the javadocs for WorkbookFactory and
> XSSFWorkbook.open(String), along with knocking up a quick unit test

Thanks.  I looked at the commit and it made sense to me.

The only caveat that I can think of is if there's a use-case where someone would want to re-use the OPCPackage and now wouldn't be able to.
Is it even possible to reuse the OPCPackage currently?

I haven't built poi from source before, so there's a bit of a learning curve for me to do that.
Comment 9 Nick Burch 2014-06-12 19:11:47 UTC
There's nothing stopping you opening one NPOIFSFileSystem or OPCPackage from a file, then re-using that to open multiple Workbook instances that you then change + save elsewhere.

That said, in that case you'd probably be creating the low level object first, then passing that to the Workbook, so you would presumably be handling the close explicitly yourself
Comment 10 Daniel Atallah 2014-06-12 19:18:55 UTC
(In reply to Nick Burch from comment #9)
> There's nothing stopping you opening one NPOIFSFileSystem or OPCPackage from
> a file, then re-using that to open multiple Workbook instances that you then
> change + save elsewhere.
> 
> That said, in that case you'd probably be creating the low level object
> first, then passing that to the Workbook, so you would presumably be
> handling the close explicitly yourself

Aha, that makes sense.  Now that you mention it, the javadoc changes to XSSFWorkbook make it clear that you can do that.
Comment 11 Nick Burch 2014-06-15 15:26:40 UTC
*** Bug 56625 has been marked as a duplicate of this bug. ***
Comment 12 Dominik Stadler 2014-07-02 15:03:02 UTC
*** Bug 56609 has been marked as a duplicate of this bug. ***
Comment 13 Dominik Stadler 2014-07-02 15:07:13 UTC
I have updated the javadoc to reflect the Closeable Workbook, I think this is done now, please reopen if there is still work pending here.
Comment 14 Nick Burch 2014-07-03 05:47:32 UTC
Thanks Dominik!

Two final little bits I've just spotted (+done):

The changelog needs updating to reflect the change, which I've done in r1607537.

The older path based constructor no longer needs to be deprecated, as you can now close the resources explicitly, updated with new javadoc text in r1607536.