Bug 52448

Summary: Cache jar indexes in WebappClassLoader to speed up resources lookup [PATCH]
Product: Tomcat 7
Reporter: Konstantin Kolinko <knst.kolinko>
Component: Catalina
Assignee: Tomcat Developers Mailing List <dev>
Status: RESOLVED WONTFIX    
Severity: enhancement    
Priority: P2    
Version: unspecified   
Target Milestone: ---   
Hardware: PC   
OS: Windows XP   
Attachments: 2010-03-22_tc6_CachingJarEntries_draft.patch

Description Konstantin Kolinko 2012-01-11 01:39:51 UTC
Created attachment 28135 [details]
2010-03-22_tc6_CachingJarEntries_draft.patch

Here is an old draft patch for WebappClassLoader that may be worth looking at, considering the recent class loader discussion.

It addresses the problem with WebappClassLoader#findResourceInternal() that is caused by these two factors:

1. The findResourceInternal method has a loop over JAR files of the web application. To find a resource it asks each JAR file whether the resource is present in it.

The lookup in each JAR file should be fast, because a zip file has an index (at the end of the file). The problem is that loading and parsing the index into a hash table takes some time.

2. There is a method WebappClassLoader#closeJARs() that closes the JARs and thus unloads those indexes. The method is called:
- once when the webapp starts, in StandardContext#startInternal()
[[[
            ((WebappLoader) getLoader()).closeJARs(true);
]]]
- periodically in the background thread, in WebappLoader#backgroundProcess()
[[[
            closeJARs(false);
]]]

The second call (force:=false) unloads the JARs only once there has been no activity for 90000 msecs = 1.5 minutes (hardcoded value).
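
The close policy described above can be sketched roughly as follows. This is only an illustration, not Tomcat's code: the class IdleJarCloser, its methods and the exact comparison (strictly greater than the limit) are assumptions made for the example. Only the 90000 ms value and the force flag come from the description.

```java
/** Hypothetical sketch of the "close when forced or idle" policy; not Tomcat code. */
class IdleJarCloser {
    /** Idle limit from the report: 90000 ms = 1.5 minutes. */
    static final long IDLE_LIMIT_MS = 90_000L;

    private long lastAccessMs;
    private boolean open;

    /** Records that a JAR was used at the given time (opens the JARs if needed). */
    void markAccess(long nowMs) {
        lastAccessMs = nowMs;
        open = true;
    }

    /**
     * Closes the JARs if forced, or if they have been idle longer than the limit.
     * Returns true if this call actually closed them.
     */
    boolean closeJars(boolean force, long nowMs) {
        // Assumption: "no activity for 90000 msecs" is modelled as a strict '>' check.
        if (open && (force || nowMs - lastAccessMs > IDLE_LIMIT_MS)) {
            open = false; // the parsed zip indexes are dropped at this point
            return true;
        }
        return false;
    }

    boolean isOpen() {
        return open;
    }
}
```

With this policy the first request after startup, or after 1.5 minutes of inactivity, always finds the JARs closed and pays the cost of reopening them.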


The problem shows up when a web application has a lot of JARs in WEB-INF/lib and you have just started it. When you access its first JSP page, the WebappClassLoader has to find some class or resource.

To do that it has to open all the JARs and scan them for the resource (a class), even if the resource is in only one JAR file, or in none of them at all, as when it comes from the parent classloader. This opening takes time and is noticeable.
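
To make the cost concrete, here is a minimal sketch of such a linear scan, using only java.util.jar. The class LinearJarScan and its jarsOpened counter are made up for this example and are not part of WebappClassLoader. Unlike the behaviour described above, the sketch stops at the first hit, so the worst case (every JAR opened) is a resource that is in none of them.

```java
import java.io.File;
import java.io.IOException;
import java.util.List;
import java.util.jar.JarFile;

/** Hypothetical illustration of a per-lookup scan over all JARs; not Tomcat code. */
class LinearJarScan {
    /** How many JARs the most recent findJar call had to open (for illustration). */
    static int jarsOpened;

    /** Returns the index of the first JAR that contains entryName, or -1 if none does. */
    static int findJar(List<File> jars, String entryName) throws IOException {
        jarsOpened = 0;
        for (int i = 0; i < jars.size(); i++) {
            // Opening a JarFile is the expensive step: it loads the zip index.
            try (JarFile jar = new JarFile(jars.get(i))) {
                jarsOpened++;
                if (jar.getJarEntry(entryName) != null) {
                    return i;
                }
            }
        }
        return -1; // e.g. the resource comes from the parent classloader
    }
}
```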

The idea in the patch is to do JAR scanning while the application starts up and all the JARs are open (before the closeJARs(true) call) and cache the names found in the JAR files.

The #findResourceInternal() call then uses the cache to open only the single JAR that actually contains the resource.
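
The idea can be sketched as below. This is a simplified stand-alone illustration and not the code from the attached patch: the class JarEntryNameCache and its methods are hypothetical, and the real patch keeps the names in a jarEntryNames field populated by openJAR(int index). It also uses Java 8 APIs for brevity, which the tc6.0.x code base would not.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

/** Hypothetical sketch of caching JAR entry names at startup; not the patch itself. */
class JarEntryNameCache {
    /** Entry name -> indexes (into the original JAR list) of the JARs that contain it. */
    private final Map<String, List<Integer>> jarsByEntryName = new HashMap<>();

    /** Scans every JAR once, e.g. during startup while the JARs are open anyway. */
    JarEntryNameCache(List<File> jars) throws IOException {
        for (int i = 0; i < jars.size(); i++) {
            try (JarFile jar = new JarFile(jars.get(i))) {
                Enumeration<JarEntry> entries = jar.entries();
                while (entries.hasMoreElements()) {
                    String name = entries.nextElement().getName();
                    jarsByEntryName.computeIfAbsent(name, k -> new ArrayList<>()).add(i);
                }
            }
        }
    }

    /** Returns the indexes of the JARs containing entryName; empty if no JAR has it. */
    List<Integer> jarsContaining(String entryName) {
        return jarsByEntryName.getOrDefault(entryName, Collections.<Integer>emptyList());
    }
}
```

A later lookup consults the map first: an empty result means no JAR has to be opened at all, and otherwise only the listed JARs are opened. The price is keeping all entry names in memory for the lifetime of the class loader.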


The patch is against r922264 of tc6.0.x. It contains some remnants of LockAwareURLClassLoader - see https://issues.apache.org/bugzilla/show_bug.cgi?id=48903#c6
Those have to be ignored. The essence is the jarEntryNames field and openJAR(int index) method that populates it.
Comment 1 Mark Thomas 2018-06-06 19:52:18 UTC
There hasn't been any interest in following up on this, so I am closing it as WONTFIX.

I'll note that the resources implementation changed in 8.0.x and that, where performance concerns have been raised, they have generally been raised against 8.0.x onwards rather than 7.0.x.