Bug 59734 - Named range creation & lookup are linear in the number of ranges
Summary: Named range creation & lookup are linear in the number of ranges
Status: RESOLVED FIXED
Alias: None
Product: POI
Classification: Unclassified
Component: XSSF (show other bugs)
Version: unspecified
Hardware: PC Linux
: P2 normal (vote)
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
: 59432 (view as bug list)
Depends on:
Blocks:
 
Reported: 2016-06-20 12:42 UTC by David North
Modified: 2016-10-15 20:26 UTC (History)
1 user (show)



Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description David North 2016-06-20 12:42:25 UTC
XSSFName.setNameName and XSSFWorkbook.getName both involve a linear iteration through all names already in the workbook. This is a major performance bottleneck for an application creating lots of named ranges.

I'll shortly attach a suggested patch which does the obvious thing and adds a map to do fast lookup.

I'll also propose deprecating the methods on Workbook which operate in an index-based manner and replacing them with ones based on names.
Comment 1 Javen O'Neal 2016-06-20 18:08:20 UTC
Are you planning on writing a CaseInsensitiveMap or importing https://commons.apache.org/proper/commons-collections/apidocs/org/apache/commons/collections4/map/CaseInsensitiveMap.html

I guess you would also need a (scope (global or sheet), named range name) tuple class as your key, so you could hide the canonical case code there.
Comment 2 David North 2016-06-21 09:07:54 UTC
The rather old patch I'm working from hides that logic in the key, so I'll probably go with that approach.
Comment 3 David North 2016-07-29 13:25:13 UTC
Fixed in r174521

I opted to go with the commons collections route after all.

Existing index-based methods have been deprecated rather than removed.
Comment 4 Javen O'Neal 2016-07-29 16:15:12 UTC
(In reply to David North from comment #3)
> Fixed in r174521
Correction: r1754521
Comment 5 Greg Woolsey 2016-09-29 00:17:11 UTC
Where is the new dependency tracked for end users?  The change log?  For people who don't use Maven or Ivy, it make come as a surprise that running their code suddenly fails with a NoClassDefFound error (it just did for me, running a local project with a local build of POI as a dependency).  I missed the commit message, and had to go find this issue.

It doesn't seem that hard to implement using a normal HashMap, where the desired name is upper or lower-cased between the user call and the map lookup.

It could be a Map<String, Map<String, Name>> construct, where the outer Map is by sheet name.  Global names would be mapped under the null entry.

Request sheet and table names would be looked up in the internal map by upper or lower case cannonicalization (null aware for sheets), and no special tuple or Commons Collections class would be needed.

Was simplicity the only reason for adding the package dependency?
Comment 6 David North 2016-09-29 07:16:13 UTC
Simplicity/good developers don't write what they can steal.

Apologies for not noting the new dependency in the change log - I'll add a note to our release process to make sure this is covered next time round.
Comment 7 Javen O'Neal 2016-09-29 14:18:52 UTC
I added a note on the new dependency to the changelog for the 3.15 release and I think it was mentioned in the release notes.[1]

I realized that this dependency addition should have been added to the release notes after the release had been generated.[2]

[1] https://poi.apache.org/changes.html
[2] https://www.apache.org/dyn/closer.lua/poi/release/RELEASE-NOTES.txt
Comment 8 Dominik Stadler 2016-10-15 20:26:17 UTC
*** Bug 59432 has been marked as a duplicate of this bug. ***