Summary: | Review ways to avoid new Integer()/Integer.valueOf() in performance sensitive places | ||
---|---|---|---|
Product: | POI | Reporter: | Dominik Stadler <dominik.stadler> |
Component: | XSSF | Assignee: | POI Developers List <dev> |
Status: | RESOLVED WONTFIX | ||
Severity: | enhancement | ||
Priority: | P2 | ||
Version: | 4.0.x-dev | ||
Target Milestone: | --- | ||
Hardware: | PC | ||
OS: | Linux | ||
Bug Depends on: | 57840 | ||
Bug Blocks: |
Description
Dominik Stadler
2017-11-03 22:21:08 UTC
This article[1] has an interesting set of tests, results, and summary information for several primitive collections frameworks, including Trove. The points most interesting to me:

* Trove isn't under active development, and hasn't been for a few years. It was also the slowest.
* Fastutil is the overall winner for performance, although memory wasn't a consideration in these tests.
* Fastutil is a 28M JAR file (!)

If we are only interested in Map<int, ? extends Object> type maps, perhaps a slimmed-down version of a stable Fastutil build could be used. Its license appears compatible, and we could investigate adding an AutoJar or similar step to the build to create an artifact bundled/released with POI.

[1] http://java-performance.info/hashmap-overview-jdk-fastutil-goldman-sachs-hppc-koloboke-trove-january-2015/

FYI, a quick and dirty test with the file from bug #57840, not representative because it was not run in a clean environment:

* Original: 3m35s
* With new Integer(): 3m22s
* With Trove and new Integer(): 2m28s

The difference between new Integer() and Integer.valueOf() was very small. Collections like Trove could provide improvements where very large documents are handled; however, I don't plan to invest more effort there for now.

My previous tests were with Java 6. With Java 8 (starting with 7, actually) things are a bit different, and I've found even better performance by increasing the size of the Integer cache, either via the system property

-Djava.lang.Integer.IntegerCache.high=<size>

or the JVM setting

-XX:AutoBoxCacheMax=<size>

This makes Integer.valueOf(int) the fastest option. I use a value greater than my expected maximum row count, typically 50000 to 100000, although I've not noticed any negative impact from using 1000000 other than the expected time to load and the memory needed to cache those simple objects.

I think rather than introduce a new dependency we should stick with Integer.valueOf(int) now that we've moved to Java 8.
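
To make the Integer cache discussion above concrete, here is a minimal sketch that probes how far the cache of a running JVM extends. It is not POI code: the class and method names (IntegerCacheProbe, isCached, highestCached) are invented for this illustration. It assumes the cache covers one contiguous range, and it deliberately compares object identity, which is not something production code should do.

```java
public class IntegerCacheProbe {

    /**
     * Returns true if Integer.valueOf(value) hands back the same instance
     * twice, i.e. the value is served from the Integer cache.
     */
    public static boolean isCached(int value) {
        Integer a = Integer.valueOf(value);
        Integer b = Integer.valueOf(value);
        return a == b; // identity comparison on purpose, not equals()
    }

    /**
     * Walks upwards from 0 and returns the largest value in [0, limit]
     * that is still cached. Assumes the cached range is contiguous.
     */
    public static int highestCached(int limit) {
        int v = 0;
        while (v < limit && isCached(v + 1)) {
            v++;
        }
        return v;
    }

    public static void main(String[] args) {
        // Hypothetical upper bound for the probe; adjust to the expected row count.
        int limit = 2_000_000;
        System.out.println("Integer.valueOf() is cached up to: " + highestCached(limit));
    }
}
```

Running it once with default settings and once with, for example, java -XX:AutoBoxCacheMax=100000 IntegerCacheProbe should show whether the flag took effect. The language specification only guarantees caching for -128 to 127, so the default run reports at least 127.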