Summary: | getHostAddress called unexpectedly, causing significant performance hit | ||
---|---|---|---|
Product: | POI | Reporter: | Jamie <jamie> |
Component: | XWPF | Assignee: | POI Developers List <dev> |
Status: | RESOLVED WONTFIX | ||
Severity: | normal | ||
Priority: | P2 | ||
Version: | 4.1.1-FINAL | ||
Target Milestone: | --- | ||
Hardware: | PC | ||
OS: | Linux | ||
Attachments: | stack trace |
In your stack trace, it appears to be org.apache.catalina.loader.WebappClassLoaderBase that is using the HashSet, not XMLBeans or POI code. I'm not sure it would help, but it might be useful if we added some options to XMLBeans to configure the SAX parser not to read external files at all. My apologies. I guess I was skimming the stack too quickly and missed that. Yes, it would be a great help if there was an option not to read external files. It would be especially useful when performing text extraction on older documents for which the external references are likely to no longer exist. It could also be beneficial if some sort of parsing timeout could be implemented. |
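As a rough illustration of what "not reading external files" could look like at the SAX level, here is a minimal sketch using plain JAXP. It is not an existing XMLBeans or POI option. The class and method names (`NoExternalFetchExample`, `newOfflineReader`, `parsesOffline`) are hypothetical, and the `load-external-dtd` feature URI is Xerces-specific, so it is assumed that the parser in use (for example the JDK's built-in one) recognises it.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class NoExternalFetchExample {

    // Builds a SAX reader that should not fetch external DTDs or entities.
    static XMLReader newOfflineReader() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        // Standard SAX features: do not include external entities.
        factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
        factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        // Xerces-specific feature (assumption: supported by the parser in use).
        factory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);

        XMLReader reader = factory.newSAXParser().getXMLReader();
        // Fallback: resolve any remaining external reference to empty content
        // instead of letting the parser open the system identifier.
        reader.setEntityResolver((publicId, systemId) -> new InputSource(new StringReader("")));
        return reader;
    }

    // Returns true if the document parses without touching external resources.
    static boolean parsesOffline(String xml) {
        try {
            XMLReader reader = newOfflineReader();
            reader.setContentHandler(new DefaultHandler());
            reader.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The DTD location is deliberately one that does not exist.
        String xml = "<!DOCTYPE r SYSTEM \"file:///nonexistent/hypothetical.dtd\"><r>text</r>";
        System.out.println(parsesOffline(xml));
    }
}
```

This does not address the parsing timeout request. That would need a separate mechanism, such as running the extraction in a task that can be cancelled.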
Created attachment 36923 [details] stack trace Our server uses POI for text extraction. When processing some documents, performance deteriorates because of unexpected calls to URLStreamHandler.getHostAddress(). Please refer to the attached stack trace for an illustration of how this happens. It is due to a known oddity in the way that URL hashCode and equals are implemented: they may attempt to resolve the URL's host name to an IP address for hashing and equality testing. Would a possible workaround be to use the URI class instead of URL?
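The suggested workaround can be sketched as follows, assuming the code that owns the collection is under your control (which, per the stack trace, may not be the case here). `URI.hashCode()` and `URI.equals()` compare the parsed components and never perform a host lookup, so a `HashSet<URI>` avoids the `getHostAddress()` call. The class and method names (`UriKeyExample`, `dedupe`) and the `example.invalid` locations are made up for illustration.

```java
import java.net.URI;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class UriKeyExample {

    // Collects locations into a hash-based set keyed by URI rather than URL.
    // Adding a URI hashes its components and does no name resolution.
    static Set<URI> dedupe(List<String> locations) {
        Set<URI> unique = new HashSet<>();
        for (String location : locations) {
            unique.add(URI.create(location));
        }
        return unique;
    }

    public static void main(String[] args) {
        Set<URI> unique = dedupe(List.of(
                "http://example.invalid/a.dtd",
                "http://example.invalid/a.dtd",
                "http://example.invalid/b.dtd"));
        System.out.println(unique.size());
    }
}
```

A URI can still be converted with `toURL()` at the point where a connection is actually opened, so the lookup cost is paid only when the resource is really fetched.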