Bug 52549

Summary: scanning HandlesTypes causes aggressive classloading
Product: Tomcat 7 Reporter: Costin Leau <costin.leau>
Component: Catalina Assignee: Tomcat Developers Mailing List <dev>
Status: RESOLVED DUPLICATE    
Severity: normal    
Priority: P2    
Version: 7.0.25   
Target Milestone: ---   
Hardware: PC   
OS: All   

Description Costin Leau 2012-01-29 11:21:47 UTC
I've run into what I would consider a bug in Tomcat 7 when the web.xml
is 3.0 (or higher).
I assume based on the Servlet 3.0 spec, the WEB-INF/classes need to be
scanned but rather than doing bytecode parsing, Tomcat 7 does actual
class loading during the webapp initialization. 
This change in semantics breaks applications that rely on bytecode enhancement or processing (such as Spring's LoadTimeWeaver). Also, any statics that are in place get initialized far too early, even if the class itself might never be used.

Webapps that work on Tomcat 5.x-7.x (with web.xml 2.5) suddenly break on Tomcat 7 web.xml 3.0 due to the eager class loading.

I'd assume every app that does instrumentation (such as JPA providers) will face
the same issue unless the whole VM is being instrumented which is quite unfortunate and avoidable.

I'm using Tomcat 7.0.25.
The culprit seems to be ContextConfig#checkHandlesTypes(JavaClass), which could postpone class loading:

// No choice but to load the class
String className = javaClass.getClassName();
...
clazz = context.getLoader().getClassLoader().loadClass(className);
...
// CL: no need to load the class for this
if (clazz.isAnnotation()) {
    // Skip
    return;
}

for (Map.Entry<Class<?>, Set<ServletContainerInitializer>> entry ...


There are a number of improvements to be applied here all just by looking at
the bytecode such as:

a. if the class is an annotation, skip it
b. if the class doesn't extend/implement any interface skip it
c. Look at the class hierarchy - this is actually quite easy (since
there's only one parent) and don't load it unless it implements
ServletContextListener
d. if there are no Servlet initializers, don't load any classes
e. if the class needs to be loaded use a throwaway classloader - that is
a clone CL of the real one which you can discard after scanning. Thus
you can do all the checks against a class but you can get rid of it at
the end. If the class is a match you can load it using the "proper"
class loader.
The problem with e) is that it's not really efficient especially in
terms of memory.
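
To illustrate points a) to c), here is a minimal sketch of reading only the class file header to get the annotation flag, the superclass name and the interface names without ever calling loadClass. It is not Tomcat's code and does not use BCEL; ClassHeaderReader and Header are hypothetical names, and the parsing follows the standard class file layout (magic, version, constant pool, access flags, this/super class, interfaces).

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public final class ClassHeaderReader {
    private static final int ACC_ANNOTATION = 0x2000;

    /** Facts needed for checks a) to c); names use the internal a/b/C form. */
    public static final class Header {
        public final int accessFlags;
        public final String className;
        public final String superName;        // null for java/lang/Object
        public final String[] interfaceNames;

        Header(int accessFlags, String className, String superName, String[] interfaceNames) {
            this.accessFlags = accessFlags;
            this.className = className;
            this.superName = superName;
            this.interfaceNames = interfaceNames;
        }

        public boolean isAnnotation() {
            return (accessFlags & ACC_ANNOTATION) != 0;
        }
    }

    /** Reads the header from raw class bytes; nothing is defined in any class loader. */
    public static Header read(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        if (data.readInt() != 0xCAFEBABE) {
            throw new IOException("Not a class file");
        }
        data.readInt(); // minor_version and major_version
        int count = data.readUnsignedShort();
        String[] utf8 = new String[count];
        int[] nameIndex = new int[count];
        for (int i = 1; i < count; i++) {
            int tag = data.readUnsignedByte();
            switch (tag) {
                case 1: utf8[i] = data.readUTF(); break;                  // Utf8
                case 7: nameIndex[i] = data.readUnsignedShort(); break;   // Class
                case 8: case 16: case 19: case 20: skip(data, 2); break;
                case 15: skip(data, 3); break;                            // MethodHandle
                case 3: case 4: case 9: case 10: case 11: case 12:
                case 17: case 18: skip(data, 4); break;
                case 5: case 6: skip(data, 8); i++; break;                // long/double use two slots
                default: throw new IOException("Unknown constant pool tag " + tag);
            }
        }
        int accessFlags = data.readUnsignedShort();
        String className = utf8[nameIndex[data.readUnsignedShort()]];
        int superIndex = data.readUnsignedShort();
        String superName = superIndex == 0 ? null : utf8[nameIndex[superIndex]];
        String[] interfaces = new String[data.readUnsignedShort()];
        for (int i = 0; i < interfaces.length; i++) {
            interfaces[i] = utf8[nameIndex[data.readUnsignedShort()]];
        }
        return new Header(accessFlags, className, superName, interfaces);
    }

    private static void skip(DataInputStream data, int n) throws IOException {
        data.readFully(new byte[n]);
    }

    public static void main(String[] args) throws IOException {
        try (InputStream in = ClassLoader.getSystemResourceAsStream("java/util/ArrayList.class")) {
            Header h = read(in);
            System.out.println(h.className + " extends " + h.superName
                    + " implements " + Arrays.toString(h.interfaceNames));
        }
    }
}
```

With such a header in hand, an annotation can be skipped via isAnnotation(), and a class with no interfaces whose parent is java/lang/Object can be skipped without any loading.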

Loading all the classes (which can be quite a lot (10K+) in several
applications) to find one or two initializers seems like a bad trade-off
which unfortunately, also breaks compatibility.
I realize that the solutions above (especially e) seem complicated but
they aren't. I see you guys have used BCEL - if you were using ASM I
would have offered help.

Basically what I'm suggesting is to be a lot more careful in doing
loading and enforcing some basic rules which can go a long way. Also using a
cache (reusing data) for the entire scanning should speed things up
pretty well. Furthermore, since you are already loading the bytecode,
doing additional checks will actually speed things up as it will avoid
class loading.
Case in point is traversing the class hierarchy: if the parent is in the
classpath, it will be scanned anyway and checking the interfaces
implemented is trivial. If this result is cached, all direct children
will be skipped right away.
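
A rough sketch of the caching idea, assuming the scan has already collected each class's parent and interface names from the bytecode. HierarchyCache and Info are hypothetical names, and types that were not scanned (for example JDK classes) are simply treated as non-matching here, which a real implementation would have to handle differently.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public final class HierarchyCache {
    /** What the bytecode scan recorded for one class (hypothetical holder). */
    public static final class Info {
        final String superName;
        final List<String> interfaces;

        public Info(String superName, List<String> interfaces) {
            this.superName = superName;
            this.interfaces = interfaces;
        }
    }

    private final Map<String, Info> scanned;
    private final String target;
    private final Map<String, Boolean> cache = new HashMap<>();

    public HierarchyCache(Map<String, Info> scanned, String target) {
        this.scanned = scanned;
        this.target = target;
    }

    /** True if the named type is the target or extends/implements it, directly or not. */
    public boolean matches(String name) {
        if (name == null) {
            return false;
        }
        if (name.equals(target)) {
            return true;
        }
        Boolean cached = cache.get(name);
        if (cached != null) {
            return cached; // parent already decided, children resolve in one lookup
        }
        boolean result = false;
        Info info = scanned.get(name);
        if (info != null) {
            result = matches(info.superName);
            for (String iface : info.interfaces) {
                if (result) {
                    break;
                }
                result = matches(iface);
            }
        }
        cache.put(name, result);
        return result;
    }
}
```

Once a parent has been resolved, every direct child is answered from the cache, and only the classes that match would need to be loaded with the real class loader.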

Thanks,
Comment 1 Costin Leau 2012-01-29 11:40:18 UTC
Example bug report caused by the side effect of eager classloading in Tomcat 7:
https://jira.springsource.org/browse/SPR-7440

Bug 52326 and bug 52444 touch on the same issue as well.

P.S. I'm aware that metadata-complete="true" fixes the issue but it's actually a work-around not a fix. First it is false by default and not many users know about it, and second, it disables the use of annotations which means one can't use annotated ServletContainerInitializer.
Comment 2 Mark Thomas 2012-01-29 12:15:57 UTC

*** This bug has been marked as a duplicate of bug 52444 ***
Comment 3 Costin Leau 2012-01-29 14:26:45 UTC
I'm not sure why this issue has been marked as a duplicate. This is not about long startup times or memory consumption, but rather unneeded class loading that simply breaks existing apps.
Thus it's not about performance but semantics.
Comment 4 Mark Thomas 2012-01-29 17:45:17 UTC
Just for the record, you appear to have missed the point of this code.

The list of ServletContainerInitializer is obtained from META-INF/services/javax.servlet.ServletContainerInitializer within each JAR, not from scanning the classes and looking for classes that implement it.

The scanning is only done if there is at least one ServletContainerInitializer that defines HandlesTypes and the scanning is looking for classes that extend or implement the classes/interfaces defined by HandlesTypes. Loading the class was a quick and dirty solution (that has survived longer than I thought it might) to determining if the class meets the extends or implements test.
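
For readers unfamiliar with the lookup described above, here is a minimal sketch of collecting initializer class names from the META-INF/services files visible to a class loader. This is not the Tomcat implementation; InitializerNames, parse and find are hypothetical names, and the parsing only assumes the usual provider-configuration file format (one class name per line, '#' starts a comment).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Enumeration;
import java.util.LinkedHashSet;
import java.util.Set;

public final class InitializerNames {
    public static final String SERVICE_FILE =
            "META-INF/services/javax.servlet.ServletContainerInitializer";

    /** Parses one provider-configuration file into class names. */
    public static Set<String> parse(Reader reader) throws IOException {
        Set<String> names = new LinkedHashSet<>();
        BufferedReader lines = new BufferedReader(reader);
        for (String line = lines.readLine(); line != null; line = lines.readLine()) {
            int hash = line.indexOf('#');
            if (hash >= 0) {
                line = line.substring(0, hash); // drop trailing comment
            }
            line = line.trim();
            if (!line.isEmpty()) {
                names.add(line);
            }
        }
        return names;
    }

    /** Collects names from every matching services file; no class is loaded here. */
    public static Set<String> find(ClassLoader loader) throws IOException {
        Set<String> names = new LinkedHashSet<>();
        Enumeration<URL> files = loader.getResources(SERVICE_FILE);
        while (files.hasMoreElements()) {
            try (Reader r = new InputStreamReader(
                    files.nextElement().openStream(), StandardCharsets.UTF_8)) {
                names.addAll(parse(r));
            }
        }
        return names;
    }
}
```

If find returns an empty set, or none of the listed initializers declares HandlesTypes, no class scanning is needed at all.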

See the duplicate for my comments on your suggestions. Short version: this is doable with some refactoring and is next on my todo list.

Feel free to change the duplicate to a bug if you wish. I'm not that bothered since it is getting fixed anyway.
Comment 5 Costin Leau 2012-01-29 21:47:04 UTC
You're right, I mixed the various types that trigger/are part of the scanning process but hopefully my suggestions (to try to eliminate loading by looking at the class content/dependencies inferred from the configuration) were understood.