Bug 60288

Summary: Quadratic complexity: org.apache.poi.POIXMLDocumentPart#findExistingRelation(POIXMLDocumentPart part)
Product: POI Reporter: Tim Helmstedt <tim.helmstedt>
Component: XSSFAssignee: POI Developers List <dev>
Status: RESOLVED FIXED    
Severity: enhancement    
Priority: P2    
Version: 3.14-FINAL   
Target Milestone: ---   
Hardware: PC   
OS: Linux   
Attachments: Test Case

Description Tim Helmstedt 2016-10-21 06:23:57 UTC
Created attachment 34395 [details]
Test Case

This was found through adding 2000 images to a sheet via org.apache.poi.xssf.usermodel.XSSFWorkbook#addPicture(byte[], int).  This isn't the only path to findExistingRelation so there's likely more ways to hit this issue. Attached is a test case which exercises this path.

The source of the quadratic slowdown comes from the defensive copy we do in org.apache.poi.openxml4j.opc.PackagePart#getRelationships. In findExistingRelation we iterate over this copy, and for each element we end up incurring another defensive copy via getRelatedPart->isRelationshipExists.

I can't see a clear reason why org.apache.poi.openxml4j.opc.PackagePart#isRelationshipExists should create a copy of _relationships, since we're not leaking them - just returning a boolean. Using the field directly solves my issue and won't have any functional impact as we're not filtering at this stage.
Comment 1 Tim Helmstedt 2016-10-21 06:32:25 UTC
Fixed in https://github.com/apache/poi/pull/38 not sure the usual way to get commit access.
Comment 2 Javen O'Neal 2016-10-21 07:35:46 UTC
Thanks for finding this performance inefficiency and submitting a patch. Applied in r1765935. These changes will be included in 3.16 beta 1.

Pull requests are fine.[1]

[1] https://poi.apache.org/guidelines.html#SubmittingPatches