Bug 59727

Summary: improving performance when adding a merged region or array formula
Product: POI
Reporter: Javen O'Neal <onealj>
Component: SS Common
Assignee: POI Developers List <dev>
Status: NEW ---    
Severity: enhancement    
Priority: P2    
Version: 3.15-dev   
Target Milestone: ---   
Hardware: PC   
OS: All   

Description Javen O'Neal 2016-06-19 22:45:57 UTC
There have been several performance-related bugs* opened for adding a merged region or array formula.
This may be slow because array formulas and merged regions cannot intersect, so adding either requires checking all existing array formulas and merged regions.
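
To illustrate why each add has to look at everything already on the sheet, here is a minimal sketch. Region and RegionRegistry are hypothetical stand-ins written for this comment, not POI classes, and the real validation in POI may differ in detail. The sketch only shows the general shape: a rectangle-overlap test applied against every existing region, which makes each add O(n) and n adds O(n^2).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a rectangular cell range (not POI's CellRangeAddress). */
final class Region {
    final int firstRow, lastRow, firstCol, lastCol;

    Region(int firstRow, int lastRow, int firstCol, int lastCol) {
        this.firstRow = firstRow;
        this.lastRow = lastRow;
        this.firstCol = firstCol;
        this.lastCol = lastCol;
    }

    /** True when the two rectangles share at least one cell. */
    boolean intersects(Region other) {
        return firstRow <= other.lastRow && other.firstRow <= lastRow
            && firstCol <= other.lastCol && other.firstCol <= lastCol;
    }
}

/** Hypothetical container that rejects overlapping regions on insert. */
final class RegionRegistry {
    private final List<Region> regions = new ArrayList<>();

    /** Scans every existing region before adding, so one add costs O(n). */
    void add(Region candidate) {
        for (Region existing : regions) {
            if (candidate.intersects(existing)) {
                throw new IllegalStateException("candidate overlaps an existing region");
            }
        }
        regions.add(candidate);
    }

    int size() {
        return regions.size();
    }
}
```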

XSSFSheet#validateArrayFormulas loops over each of the cells in a merged region, gets the array formula region that the cell belongs to and determines if it intersects.

If there are few array formulas in a sheet, or if the merged region or array formula being added is large, it may be faster to loop over the array formulas in the sheet than to loop over the cells where the intersection would occur. Right now we're looping over cells and looping over the merged regions list, so there's probably room for improvement here.
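
A rough sketch of the trade-off described above, under stated assumptions: Rect and OverlapCheck are hypothetical names, ownerByCell is an assumed index from a packed (row, column) key to the array formula range covering that cell, and the size comparison in overlaps is only an illustrative heuristic, not something measured against POI.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical rectangular range; stands in for a merged region or array formula range. */
final class Rect {
    final int firstRow, lastRow, firstCol, lastCol;

    Rect(int firstRow, int lastRow, int firstCol, int lastCol) {
        this.firstRow = firstRow;
        this.lastRow = lastRow;
        this.firstCol = firstCol;
        this.lastCol = lastCol;
    }

    long cellCount() {
        return (long) (lastRow - firstRow + 1) * (lastCol - firstCol + 1);
    }

    boolean intersects(Rect o) {
        return firstRow <= o.lastRow && o.firstRow <= lastRow
            && firstCol <= o.lastCol && o.firstCol <= lastCol;
    }
}

final class OverlapCheck {
    /** Packs a (row, column) pair into one map key. */
    static long key(int row, int col) {
        return ((long) row << 32) | (col & 0xFFFFFFFFL);
    }

    /** Builds the assumed cell-to-owning-range index for the array formulas. */
    static Map<Long, Rect> index(List<Rect> arrayFormulas) {
        Map<Long, Rect> ownerByCell = new HashMap<>();
        for (Rect f : arrayFormulas) {
            for (int r = f.firstRow; r <= f.lastRow; r++) {
                for (int c = f.firstCol; c <= f.lastCol; c++) {
                    ownerByCell.put(key(r, c), f);
                }
            }
        }
        return ownerByCell;
    }

    /** Cell-by-cell strategy: cost grows with the candidate's cell count. */
    static boolean byCells(Rect candidate, Map<Long, Rect> ownerByCell) {
        for (int r = candidate.firstRow; r <= candidate.lastRow; r++) {
            for (int c = candidate.firstCol; c <= candidate.lastCol; c++) {
                if (ownerByCell.containsKey(key(r, c))) {
                    return true;
                }
            }
        }
        return false;
    }

    /** Region-by-region strategy: cost grows with the number of array formulas. */
    static boolean byRegions(Rect candidate, List<Rect> arrayFormulas) {
        for (Rect f : arrayFormulas) {
            if (candidate.intersects(f)) {
                return true;
            }
        }
        return false;
    }

    /** Illustrative heuristic: take whichever loop has fewer iterations. */
    static boolean overlaps(Rect candidate, List<Rect> arrayFormulas,
                            Map<Long, Rect> ownerByCell) {
        return arrayFormulas.size() <= candidate.cellCount()
            ? byRegions(candidate, arrayFormulas)
            : byCells(candidate, ownerByCell);
    }
}
```

Both strategies return the same answer; they differ only in what they iterate over, so a large candidate on a sheet with few array formulas favors byRegions.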

If the intersection code were slow relative to CellRangeAddress.hashCode+equals, it would be cheaper to add each checked address to a set. Fortunately, the intersection code costs about the same as equals, so the set approach would actually be slower.

* related:
bug 55280: Poor (Blocking) Performance of Merged Regions
bug 59212: Do not check for overlapping regions when adding merged regions to a sheet
bug 58885: Performance regression on addition of merged regions