Bug 55317 - Facilitate weaving by allowing ClassFileTransformer to be added to WebppClassLoader
Summary: Facilitate weaving by allowing ClassFileTransformer to be added to WebppClass...
Status: RESOLVED FIXED
Alias: None
Product: Tomcat 7
Classification: Unclassified
Component: Catalina (show other bugs)
Version: trunk
Hardware: All All
: P2 enhancement (vote)
Target Milestone: ---
Assignee: Tomcat Developers Mailing List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-07-28 07:17 UTC by Nick Williams
Modified: 2015-07-17 08:02 UTC (History)
0 users



Attachments
Proposal for transformer registration API (2.05 KB, patch)
2013-08-06 06:08 UTC, Jeremy Boynes
Details | Diff
Proposed implementation of this feature (38.45 KB, patch)
2013-08-21 23:15 UTC, Nick Williams
Details | Diff
Proposed implementation of this feature (37.74 KB, patch)
2013-08-21 23:21 UTC, Nick Williams
Details | Diff
Proposed implementation of this feature (40.41 KB, patch)
2013-09-12 19:30 UTC, Nick Williams
Details | Diff
Proposed implementation of this feature (38.93 KB, patch)
2013-09-14 02:30 UTC, Nick Williams
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Nick Williams 2013-07-28 07:17:28 UTC
Most application servers provide a mechanism whereby web applications can instrument class bytecode. This is normally in the form of two or three instrumentation methods on the ClassLoader implementation responsible for web application class loading. Spring Framework, for example, provides a way to automatically detect and use the various instrumentable ClassLoaders provided by different application servers. However, since Tomcat's o.a.catalina.loader.WebappClassLoader is not instrumentable, Spring cannot use it directly. Instead, it provides a class that extends WebappClassLoader to implement the instrumentation. Users must place this special spring-instrument-tomcat.jar file in $TOMCAT_HOME/lib and put a loader declaration in META-INF/context.xml in order to get this to work--rather a pain in the rear.

Spring's changes in the extended WebappClassLoader are really very trivial and involve only 2 classes: a helper class called WeavingTransformer [1] and the extended TomcatInstrumentableClassLoader [2]. I intend to port these changes to Tomcat for Tomcat 7 and 8 and submit a patch.

However, before I get started I needed to get two answers from the community:

1) Is there any objection to such a simple addition to WebappClassLoader? I think it would be an easy-but-big improvement to Tomcat.

2) What steps do I need to take to get this Spring code officially "donated" to Tomcat so that all legal requirements are met? The code is licensed Apache 2.0, so that at least should make it easier.

[1] https://github.com/SpringSource/spring-framework/blob/master/spring-instrument-tomcat/src/main/java/org/springframework/instrument/classloading/WeavingTransformer.java
[2] https://github.com/SpringSource/spring-framework/blob/master/spring-instrument-tomcat/src/main/java/org/springframework/instrument/classloading/tomcat/TomcatInstrumentableClassLoader.java
Comment 1 Mark Thomas 2013-07-28 07:52:25 UTC
I have no objections in principle but obviously I'd want to see the proposed patch first.

I will say that someone from the Spring project will need to confirm that any proposed patch is a) correct and b) helpful.
Comment 2 Rainer Jung 2013-07-28 08:15:31 UTC
Why would one need to extend the webapp class loader? A quick look to java.lang.instrument indicates that one should be able to do it with a normal instrumentation agent. Can you clarify?

The registered ClassFileTransformers will get the classloader passed in in their transform() call so can decide whether it is a webapp classloader and it wants to act on the class.

Probably it would be nice if there were an easy way to associate the webapp with the WebappClassLoader, or at least the WebappLoader could be retrieved from the WebappClassLoader. From there one can get the necessary info about the webapp via the container.
Comment 3 Rainer Jung 2013-07-28 09:25:09 UTC
Sorry, me again: if we want to support it in an agent free way, it seems it woul suffice to add a addTransformer() (or probably better registerTransformer() and unregisterTranformer()) call to the WebappClassLoader that maintains a list of added tranformers and calls them in findResourceInternal() similar to the original spring code. The original spring code calls all of them in turn until the first one succeeds. Don't know whether it would be better to proceed calling in registration order to allow mutliple transformations to occur.

There's no real need to support the proprietry getThrowawayClassLoader() method, which makes most of the lines unnecessary and we'll also not need the WeavingTransformer because we can call the registered Transformers directly.

As far as I can see spring can support the getThrowawayClassLoader() on top of that without subclassing our class loader.

WDYT?
Comment 4 Nick Williams 2013-07-28 15:15:08 UTC
First, here's the companion Spring Framework enhancement request: https://jira.springsource.org/browse/SPR-10788

/I/ don't need to extend WebappClassLoader. I plan on simply adding to the existing WebappClassLoader. Spring Framework extends it because that's the only way to make it work without changing Tomcat directly. The reason any changes are needed at all is because a standard ClassLoader does not just support instrumentation out-of-the-box. You must instrument a class file /as it is being loaded/, before it is initialized/instantiated.

I like the idea of creating registerTransformer() and unregisterTranformer() instead of addTransformer(), but that is a deviation from the unofficial convention that other containers abide by. If Mark is also okay with registerTransformer() and unregisterTranformer(), I'll take that approach.

I'm not exactly sure what getThrowawayClassLoader() is for, either. I have asked some Spring folks to clarify. It exists consistently across containers and Spring container-support classes, so I assume it must be important. We'll see.
Comment 5 Jeremy Boynes 2013-07-28 17:37:08 UTC
I have reservations around the security consequences of providing anything that has access to the webapp classloader the ability to modify any code defined by that classloader. 

Currently, that has to be specifically enabled by adding an agent to the JVM or by adding special classloader to the container installation and enabling it in a web application's Context. This change would automatically enable this for all applications. Further, if an application was enabled for cross-context dispatch it would also be able to register a transformer to modify the code of other applications.

Some sort of check seems needed here - for example, that transformers can only transform classes for their own web application where permission has been explicitly granted (perhaps based on ProtectionDomain). I'd think the ProtectionDomain should also be passed to any ClassFileTransformer called.
Comment 6 Jeremy Boynes 2013-07-28 17:47:22 UTC
How does this interact with annotation scanning and ServletContainerInitializers?

For example, a transformer could add or remove annotations or class dependencies (e.g. adding new interfaces) that related to component scans or types handled by SCIs. Tomcat reads the bytecode directly for this, which would be different to bytecode returned from a transformed class and could result in ambiguous or inconsistent application behaviour.
Comment 7 Jeremy Boynes 2013-07-28 17:57:30 UTC
Updated summary to indicate this is more about application driven bytecode weaving rather than Instrumentation (in the way the JRE defines instrumentation).
Comment 8 Nick Williams 2013-07-28 18:52:19 UTC
To address Jeremy's concerns/questions:

1) I don't see a security issue. JBoss, GlassFish, WebLogic, and WebSphere, for example, all "automatically enable this for all applications." You say, "Currently, that has to be specifically enabled by adding an agent to the JVM or by adding special classloader to the container installation and enabling it in a web application's Context," and I say that's a bad thing. It makes weaving difficult to configure, and sometimes impossible in hosted environments where the consumer doesn't have the ability to jump through these hoops.

2) I took some time to peruse the code on GlassFish, particularly  org.glassfish.web.loader.WebappClassLoader. It performs no special security checks. If someone calls addTransformer, it just lets them add the transformer. That's all. It doesn't use a ProtectionDomain, and it passes null to the transformer's transform method. JBoss also performs no security checks in its code, though it does pass the existing ProtectionDomain to the transformer. It's up to the transformer to use or not use the ProtectionDomain, but apparently it's allowed to be null.

3) If a user enables cross-context dispatch (off by default), they don't do so accidentally (they may do so ignorantly, but that's not our concern). Enabling cross-context dispatch already has its own whole set of security concerns, and I consider the possibilities with this minor compared to those other security concerns.

4) I don't have any particular objection to some sort of Tomcat setting (maybe in catalina.properties?) that disables/disallows transformation, though I strongly believe it should be enabled by default.

5) "How does this interact with annotation scanning and ServletContainerInitializers?" All of this scanning happens /before/ any application code is allowed to execute. As such, it happens before any code is capable of adding transformers. Thus, this won't affect such scanning at all. Transformers are only going to want to instrument classes that they use--that won't be SCIs. As far as adding/removing annotations, the creator of a transformer to be used in a web application knows better than to attempt to add or remove annotations that might be scanned for by the container. None of the other containers do anything to anticipate/detect/prevent this bad programming practice, which wouldn't work but also wouldn't hurt anything.

Now, to address what Rainer was saying about getThrowawayClassLoader(). After reading some GlassFish and JBoss source code, I've figured out what it's for. Code that adds transformers to a ClassLoader may need to initially load a class to determine whether it requires transformation (for example, Spring looks for classes annotated @Configurable and only adds a transformer for them if it finds a class annotated @Configurable). GlassFish implements this with a copy() method that simply returns a new UrlClassLoader with the same URL and parent class loader as the WebappClassLoader.
Comment 9 Nick Williams 2013-08-03 16:35:22 UTC
I've received some guidance from the folks over at SpringSource (https://jira.springsource.org/browse/SPR-10788) and I'd like to get started on a patch for this now. However, I've yet to hear from Mark whether he has a preference for adding a single addTransformer() method as I had originally suggested or adding registerTransformer() and unregisterTranformer() methods as Rainer suggested (or, a third alternative, addTransformer() and removeTransformer()).

Mark, what do you think? If you don't have a preference, I'll probably go Rainer's route but with addTransformer() and removeTransformer() as it's more flexible but also uses a convention (addTransformer()) that other containers use.

Thoughts?
Comment 10 Mark Thomas 2013-08-05 14:27:46 UTC
(In reply to Nick Williams from comment #9)

> Mark, what do you think?

My primary concern has already been articulated in comment #1.

> If you don't have a preference, I'll probably go
> Rainer's route but with addTransformer() and removeTransformer() as it's
> more flexible but also uses a convention (addTransformer()) that other
> containers use.

I have a very small preference for addTransformer() and removeTransformer() but I'm happy to defer to the preferences of the Spring folks.

Mark
Comment 11 Nick Williams 2013-08-05 17:42:07 UTC
Okeydokey. addTransformer() and removeTransformer() is my preference and also what's easiest on the Spring folks (it will require no code change on their part to work), so that's the direction we'll go.
Comment 12 Nick Williams 2013-08-05 18:08:03 UTC
I noticed WebappClassLoader's toString() method includes a lot of information about the ClassLoader, such as the parent ClassLoader, context name, delegation setting, and repository. Should I also change toString() to append information about any attached transformers?
Comment 13 Jeremy Boynes 2013-08-06 06:08:19 UTC
Created attachment 30678 [details]
Proposal for transformer registration API

(In reply to Nick Williams from comment #11)
> Okeydokey. addTransformer() and removeTransformer() is my preference and
> also what's easiest on the Spring folks (it will require no code change on
> their part to work), so that's the direction we'll go.

I'd suggest adding these in a separate interface for just this functionality so that applications don't need to depend on the actual implementation. The Tomcat API package (org.apache.tomcat) might be a good location.
Comment 14 Nick Williams 2013-08-21 23:15:45 UTC
Created attachment 30748 [details]
Proposed implementation of this feature

I have attached a patch with the implementation complete. In addition to 10 thorough unit tests, my functional testing with Spring Framework indicates that it works great.

The only remaining question I have, which may require a slight tweak, is regarding memory leaks. I can't figure out whether transformers added to the WebappClassLoader (which will come from application classes) constitute a potential memory leak. It seems to me that the WebappClassLoader instance is cleaned up when the application is shut down, so it shouldn't be a problem. But I wanted to get someone's opinion on whether the WebappClassLoader should clear out the contents of the "transformers" LinkedList when it shuts down. I'm not convinced it's necessary, so I left it out for now.
Comment 15 Nick Williams 2013-08-21 23:21:04 UTC
Created attachment 30749 [details]
Proposed implementation of this feature

There was an error in my previous patch. Please disregard and accept this one instead.
Comment 16 Mark Thomas 2013-08-29 09:28:17 UTC
(In reply to Nick Williams from comment #15)
> Created attachment 30749 [details]
> Proposed implementation of this feature

1. Why loop over list rather than using contains() in addTransformer() ?

2. Should addTransformer() be looking for multiple instances of the same Transformer or multiple instances of the same class of Transformer?

3. Why not use List.remove(Object) in removeTransformer() ?

4. I'm concerned that synchronizing on the list of transformers while classes are transformed will become a bottleneck when lots of classes are being loaded and the transformer is relatively slow. A separate ReadWriteLock for the transformer list is probably the way to go but really some testing is required to determine if there is an issue here or not.

5. I'm not a fan of the org.apache.tomcat.unittest package unless the classes concerned are going to be used by multiple tests across multiple packages.
Comment 17 Nick Williams 2013-09-12 19:30:44 UTC
Created attachment 30825 [details]
Proposed implementation of this feature

(In reply to Mark Thomas from comment #16)
> 1. Why loop over list rather than using contains() in addTransformer() ?
>  
> 3. Why not use List.remove(Object) in removeTransformer() ?

That was my own mistake. I didn't read the Javadoc properly. I have corrected this in the attached revised patch.

> 4. I'm concerned that synchronizing on the list of transformers while classes are transformed will become a bottleneck when lots of classes are being loaded and the transformer is relatively slow. A separate ReadWriteLock for the transformer list is probably the way to go but really some testing is required to determine if there is an issue here or not.

Another mistake of mine. This could DEFINITELY be a problem if multiple threads are loading classes at the same time. A ReadWriteLock is definitely preferable over synchronization here. I have corrected this in the attached revised patch.

> 5. I'm not a fan of the org.apache.tomcat.unittest package unless the classes concerned are going to be used by multiple tests across multiple packages.

Understood. I have relocated/renamed these two classes in accordance with the discussion on the mailing list. The changes are in the attached revised patch.

> 2. Should addTransformer() be looking for multiple instances of the same Transformer or multiple instances of the same class of Transformer?

No, this was correct. It could be valid to have multiple instances of the same transformer class, but not multiple copies of the same instance.

An example use case is an application using JPA with Spring Framework. JPA abstracts away from the java.lang.instrument.ClassFileTransformer by specifying a javax.persistence.spi.ClassTransformer. JPA providers add ClassTransformers to the persistence unit instead of ClassFileTransformers. Applying this directly would require the ClassLoader to support javax.persistence.spi.ClassTransformer, which won't work in many cases (such as Tomcat). To get around this, Spring Framework uses a org.springframework.orm.jpa.persistenceunit.ClassFileTransformerAdapter to wrap a ClassTransformer with a ClassFileTransformer. If a provider adds multiple ClassTransformer implementations to the persistence unit, Spring will in turn add multiple ClassFileTransformerAdapter instances to the ClassLoader. All of these instances will do something different, but they will be of the same class as far as WebappClassLoader can tell.
Comment 18 Jeremy Boynes 2013-09-13 15:11:29 UTC
1) CopyOnWriteArrayList would simplify managing added transformers and is intended for this use-case.

2) I would not try to de-dupe the list at all. There's no contract around equals() for transformers and this will be different from how JRE's Instrumentation manages them.

3) You're applying the transformers in findResourceInternal() which, I think, means getResource() will return transformed data. The transform should be moved into findClass() so only defined classes are transformed.

4) In the messages, we typically wrap [] around the substituted values i.e. [{0}]
Comment 19 Nick Williams 2013-09-13 15:27:36 UTC
(In reply to Jeremy Boynes from comment #18)
> 1) CopyOnWriteArrayList would simplify managing added transformers and is
> intended for this use-case.

How have I never heard of this class before? Certainly, mutation operations will definitely vastly outnumber traversal operations. Perhaps CopyOnWriteArrayList is the way to go here, instead of any locking. Mark? Thoughts?
 
> 2) I would not try to de-dupe the list at all. There's no contract around
> equals() for transformers and this will be different from how JRE's
> Instrumentation manages them.

You're correct, there's no contract for equals in ClassFileTransformer. By default a "duplicate" will be if the transformers are the exact same instance (==). Only if the person who implemented the transformer overrides Object#equals will a different behavior occur. Arguably, if the person overrides Object#equals, they likely did so to specify a different definition of "duplicate." I disagree that we shouldn't try to keep duplicates out of the list. I /would/ be open to saying that we should specifically look for duplicate instances (==) and not rely on Object#equals, in which case we need to go back to iterating over the list instead of using List#contains.

> 3) You're applying the transformers in findResourceInternal() which, I
> think, means getResource() will return transformed data. The transform
> should be moved into findClass() so only defined classes are transformed.

#findResourceInternal definitely feels like the right place for this. Re: "so only defined classes are transformed," how I've written it means only classes are transformed because the transformers are only applied if isClassResource is true. I don't see a problem here, but maybe I don't understand what you're saying.

> 4) In the messages, we typically wrap [] around the substituted values i.e.
> [{0}]

I would challenge "typically". Looks like about 50% of the time to me. If I'm being told that only [{0}] is right, I'll change it. But there is /plenty/ of code that isn't "correct" if this is the case. How are contributors to know which to use? They can't.
Comment 20 Mark Thomas 2013-09-13 17:20:59 UTC
(In reply to Nick Williams from comment #19)
> (In reply to Jeremy Boynes from comment #18)
> > 1) CopyOnWriteArrayList would simplify managing added transformers and is
> > intended for this use-case.
> 
> How have I never heard of this class before? Certainly, mutation operations
> will definitely vastly outnumber traversal operations. Perhaps
> CopyOnWriteArrayList is the way to go here, instead of any locking. Mark?
> Thoughts?

As long as the solution is reasonable and meets the requirements there won't be any objection from me. I can't speak for any other committer.

> > 4) In the messages, we typically wrap [] around the substituted values i.e.
> > [{0}]
> 
> I would challenge "typically". Looks like about 50% of the time to me. If
> I'm being told that only [{0}] is right, I'll change it. But there is
> /plenty/ of code that isn't "correct" if this is the case. How are
> contributors to know which to use? They can't.

All new code uses [{0}] for clarity on what exactly the value is (makes it easier to spot empty values and unexpected white space). We are fixing older values slowly over time.

Mark
Comment 21 Nick Williams 2013-09-14 02:30:12 UTC
Created attachment 30838 [details]
Proposed implementation of this feature

Revised patch again. Switched to using CopyOnWriteArrayList, which should be much faster than synchronization or a ReadWriteLock to protect from concurrency issues. Wrapped [] around substitution values. Fixed mergability issues due to changes to WebappClassLoader in last couple of days.

I think this is very close. Juergen has been at SpringOne all week, but he should be home in the next day or two. Hopefully he can look at this soon now that things are less stressful over there. I'd like to get all the issues settled on the Tomcat side so that, once Juergen gives it the thumbs up, it can be committed ASAP. I'd really love to get this in 8.0.RC2 and 7.0.44, if at all possible.
Comment 22 Juergen Hoeller 2013-09-26 19:05:28 UTC
The patch looks fine to me. Would be great to merge it into Tomcat proper as soon as possible.

Fortunately, since we're reflectively discovering those methods, we don't need to wait for Tomcat binaries to get Spring's part done: Based on the conventions in the patch, we'll adapt our discovery mechanism for Spring Framework 4.0 RC1, due on Oct 15th. Of course, it would be great if a corresponding Tomcat 8 release wasn't far beyond that...

Juergen
Comment 23 Mark Thomas 2013-09-26 22:57:01 UTC
The patch has been applied with a few minor tweaks to fix the various warnings Eclipse threw up. It will be included in 8.0.0-RC4 onwards. I anticipate the next 8.0.0-RCx release around (maybe even before) the Spring 4 RC1 release.
Comment 24 Nick Williams 2013-09-26 23:15:50 UTC
Excellent. Thanks! Two things:

1) InstrumentableClassLoader, as committed, says "@since 8.0, 7.0.44." At the time I submitted my most recent patch, 7.0.44 was "next." As you know, due to several recent vote cancellations, 7.0.44 was actually never released. At the very least, it should say "@since 8.0, 7.0.44." Which brings me to point 2...

2) CAN this be included in 7.0.46? I'd like it to be. It will be some time before 8.0 obtains widespread adoption, and I don't see this as a breaking change, so I believe it would be beneficial to include it in 7.0.46. If it can't be in 7.0, then InstrumentableClassLoader just needs to be changed to say "@since 8.0."
Comment 25 Nick Williams 2013-09-26 23:16:34 UTC
*** At the very least, it should say "@since 8.0, 7.0.46."
Comment 26 Violeta Georgieva 2015-07-17 08:02:02 UTC
The feature is backported to 7.0.x for 7.0.64 onwards.