Bug 59773

Summary: Move loop invariants outside of loop for faster execution
Product: POI
Reporter: Javen O'Neal <onealj>
Component: POI Overall
Assignee: POI Developers List <dev>
Status: RESOLVED FIXED    
Severity: enhancement    
Priority: P2    
Version: 3.15-dev   
Target Milestone: ---   
Hardware: Other   
OS: All   

Description Javen O'Neal 2016-06-30 03:13:14 UTC
POI has a fair amount of code that reads
for(int i=0; i<getNumberOfStyles(); i++) {
    // do something
}

getNumberOfStyles() is called on every loop iteration. At best this is a lot of function calls. At worst, this is a lot of expensive function calls.

We can pretty easily find cases where we can make POI faster by moving loop invariants (value doesn't change over the for loop, function doesn't have any side-effects) outside the loop.
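To illustrate the difference, here is a minimal sketch that counts how often the bound is evaluated in each form. StyleTable is a hypothetical stand-in written for this example, not a POI class, and the call counter exists only to make the effect visible.

```java
import java.util.ArrayList;
import java.util.List;

public class LoopInvariantDemo {
    // Hypothetical stand-in for a workbook-like class; NOT a POI API.
    static class StyleTable {
        private final List<String> styles = new ArrayList<>();
        int calls = 0; // counts invocations of getNumberOfStyles()

        StyleTable(int n) {
            for (int i = 0; i < n; i++) {
                styles.add("style" + i);
            }
        }

        int getNumberOfStyles() {
            calls++;
            return styles.size();
        }
    }

    // Original form: the loop condition calls the getter before every iteration.
    static int callsWithoutHoisting(int n) {
        StyleTable table = new StyleTable(n);
        for (int i = 0; i < table.getNumberOfStyles(); i++) {
            // do something
        }
        return table.calls;
    }

    // Hoisted form: the getter is called once, before the loop starts.
    static int callsWithHoisting(int n) {
        StyleTable table = new StyleTable(n);
        final int numStyles = table.getNumberOfStyles();
        for (int i = 0; i < numStyles; i++) {
            // do something
        }
        return table.calls;
    }

    public static void main(String[] args) {
        System.out.println(callsWithoutHoisting(100)); // 101: once per iteration plus the final failing check
        System.out.println(callsWithHoisting(100));    // 1
    }
}
```

Hoisting is only safe when the loop body cannot change the value and the getter has no side effects, as noted above.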

grep -r --exclude-dir=".svn" -P "for\s*\([^:\(]+\(\)[^:\)]+\)"

This finds zero-argument function calls in the header of a for loop, excluding for-each loops.
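To see what the pattern does and does not catch, the same regex can be tried against a few sample lines with java.util.regex. This is only a sketch: the class name and the sample lines are made up for illustration, and it assumes that grep -P and Java treat this particular pattern the same way.

```java
import java.util.regex.Pattern;

public class ForLoopCallFinder {
    // Same pattern as the grep command above, with backslashes escaped for a Java string.
    static final Pattern FOR_WITH_CALL =
            Pattern.compile("for\\s*\\([^:\\(]+\\(\\)[^:\\)]+\\)");

    // Returns true if the line contains a for-loop header with a zero-argument call.
    static boolean matches(String line) {
        return FOR_WITH_CALL.matcher(line).find();
    }

    public static void main(String[] args) {
        // Candidate for hoisting: zero-argument call in the loop condition.
        System.out.println(matches("for(int i=0; i<getNumberOfStyles(); i++) {"));
        // For-each loop: excluded because of the ':'.
        System.out.println(matches("for (String s : getNames()) {"));
        // No call in the header at all.
        System.out.println(matches("for (int i = 0; i < n; i++) {"));
        // Iterator loop: matched, but not a hoisting candidate.
        System.out.println(matches("for (Iterator<X> it = list.iterator(); it.hasNext(); ) {"));
    }
}
```

The last sample shows why every hit needs a manual look: the pattern cannot tell an invariant getter from an iterator or from a call with side effects.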

There are currently 514 instances. Some may be for-loops over iterators or functions with side-effects. The rest could probably be made faster without harming readability.

There are likely expressions or functions that are re-evaluated within loops and could be pulled out of an inner loop, or out of the loops entirely, but these are a bit trickier to find with a regex.
Comment 1 Javen O'Neal 2016-07-02 18:29:21 UTC
Applied some changes (mostly to HSSF and XSSF classes) in r1751086.
Comment 2 Javen O'Neal 2016-07-03 07:21:53 UTC
More changes to SS, XSSF, POI util, and SignatureInfo in r1751131.
Comment 3 Javen O'Neal 2016-07-04 01:50:05 UTC
Updated VisioTextExtractor in r1751193
Comment 4 Javen O'Neal 2016-08-15 05:39:46 UTC
Updated PowerPointExtractor in r1756345
Comment 5 Dominik Stadler 2018-10-21 16:59:28 UTC
I am closing this as resolved for now. We should run some micro-benchmarks to verify that further changes actually improve execution speed; the JVM may already optimize away most of the overhead nowadays.
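A rough sketch of what such a comparison could look like is below. It is not taken from POI: the class and method names are hypothetical, and plain System.nanoTime() timing like this is easily distorted by JIT warm-up and dead-code elimination, so a proper harness such as JMH would be needed before drawing conclusions.

```java
import java.util.ArrayList;
import java.util.List;

public class HoistTimingSketch {
    // Bound re-evaluated on every iteration.
    static long sumUnhoisted(List<Integer> list) {
        long sum = 0;
        for (int i = 0; i < list.size(); i++) {
            sum += list.get(i);
        }
        return sum;
    }

    // Bound read once before the loop.
    static long sumHoisted(List<Integer> list) {
        long sum = 0;
        final int n = list.size();
        for (int i = 0; i < n; i++) {
            sum += list.get(i);
        }
        return sum;
    }

    // Builds the list 0, 1, ..., n-1.
    static List<Integer> sampleData(int n) {
        List<Integer> data = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            data.add(i);
        }
        return data;
    }

    public static void main(String[] args) {
        List<Integer> data = sampleData(1_000_000);
        long sink = 0; // accumulate results so the loops are not optimized away
        for (int round = 0; round < 20; round++) { // early rounds act as warm-up
            long t0 = System.nanoTime();
            sink += sumUnhoisted(data);
            long t1 = System.nanoTime();
            sink += sumHoisted(data);
            long t2 = System.nanoTime();
            System.out.printf("round %d: unhoisted %d ns, hoisted %d ns%n",
                    round, t1 - t0, t2 - t1);
        }
        System.out.println("checksum: " + sink);
    }
}
```

ArrayList.size() is a trivial getter, so any difference here may well be negligible; the comparison would matter more for getters that do real work.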