Bug 52833

Summary: NPE with 7.0.26
Product: Tomcat 7 Reporter: olamy <olamy>
Component: CatalinaAssignee: Tomcat Developers Mailing List <dev>
Status: RESOLVED FIXED    
Severity: regression CC: olamy
Priority: P2    
Version: 7.0.26   
Target Milestone: ---   
Hardware: All   
OS: Mac OS X 10.4   
Attachments: attach a patch for review

Description olamy 2012-03-05 23:28:06 UTC
Upgrading the maven plugin 7.0.26, the goal which runs tomcat with a conf dir (not the embed version) generate the following NPE 
Log:
	at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:236)
	at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:141)
	at org.apache.catalina.startup.Catalina.start(Catalina.java:645)
	at org.apache.tomcat.maven.plugin.tomcat7.run.AbstractRunMojo.startContainer(AbstractRunMojo.java:679)
	at org.apache.tomcat.maven.plugin.tomcat7.run.AbstractRunMojo.execute(AbstractRunMojo.java:365)
	at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:101)
	at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:209)
	at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
	at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
	at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:84)
	at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:59)
	at org.apache.maven.lifecycle.internal.LifecycleStarter.singleThreadedBuild(LifecycleStarter.java:183)
	at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:161)
	at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:320)
	at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:156)
	at org.apache.maven.cli.MavenCli.execute(MavenCli.java:544)
	at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:197)
	at org.apache.maven.cli.MavenCli.main(MavenCli.java:141)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:290)
	at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:230)
	at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:409)
	at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:352)
Comment 1 olamy 2012-03-05 23:45:50 UTC
Created attachment 28422 [details]
attach a patch for review

simply adding a NPE check.
Comment 2 Mark Thomas 2012-03-06 14:21:25 UTC
Hmm. I'm failing to understand how the StandardServer ends up with a parent class loader that is not a child of the system class loader.

At the moment this looks like a configuration error and I am leaning towards resolving this as INVALID.
Comment 4 Mark Thomas 2012-03-09 15:09:56 UTC
I'm not exactly sure what is going on here as I can't repeat this with a simple test case. However, it does appear that r1242101 is the root cause.

I can only repeat this is I set Catalina's parent class loader to null. I guess that is what the plug-in code is doing somewhere and while in the past it has got away with it, as a result of r1242101 it is now failing.

I'm having a hard time coming up with a valid reason for setting the parent class loader to null. I am therefore leaning towards INVALID for this, although I'm not adverse to adding a null check to setParentClassLoader() and throwing an IAE if null is passed in to make this easier to debug.
Comment 5 Mark Thomas 2012-03-14 15:08:40 UTC
Without any further input, this is going to get resolved as INVALID prior to the 7.0.27 release.
Comment 6 olamy 2012-03-14 22:14:02 UTC
Sorry didn't have time to do more testing. 
I will try to find change something to make that work again with version 7.0.26.
BTW what prevent you to do a simple null check in the code ?
Comment 7 Mark Thomas 2012-03-15 09:10:52 UTC
I'm against the patch proposed since that addresses the symptom rather than the cause. I'm not against added a null check that throws an IAE at the point where the parent class loader is set to null, if you (as the user) would find that useful. However, if you don't think it would be useful, there isn't much point me adding it :)

All of this assumes that something in your code is setting the parent class loader to null. If the root cause is something else then all bets are off and we'll have to look at this aginan.
Comment 8 olamy 2012-03-17 22:06:13 UTC
What I don't understand is why in the parent classLoader chain having SystemClassLoader is mandatory!
That's what the current code assume. Frankly I don't understand why you consider that mandatory. In some env especially osgi it's not!. 
The patch just prevent a NPE in case of SystemClassLoader is not in the parent classLoader chain.
Why that could not be a valid use case ?
Comment 9 Mark Thomas 2012-03-19 20:43:53 UTC
(In reply to comment #8)
> What I don't understand is why in the parent classLoader chain having
> SystemClassLoader is mandatory!

OK.

> That's what the current code assume.

I know.

> Frankly I don't understand why you consider that mandatory.

I don't. The original developers of Tomcat did. While I can make some educated guesses why this was done based on the Servlet and J2EE specifications I can't be sure. However, since it is an assumptions that appears in multiple places across the code base I am very reluctant to change it without fully understanding the impacts that might have.

> In some env especially osgi it's not!.

OK.
 
> The patch just prevent a NPE in case of SystemClassLoader is not in the parent
> classLoader chain.

No, that patch addresses that particular symptom in one place in the code. As I stated previously, it is the root cause I am interested in fixing rather than one particular symptom.

> Why that could not be a valid use case ?

As I stated previously, I'm having a hard time coming up with a valid reason for setting the parent class loader to null.

Having looked more widely at this issue, it is clear that the implementation of Catalina#getParentClassLoader() is not consistent with other implementations. I'll get that fixed shortly. Depending on what you are doing (which you have yet to be clear about) that may resolve the problem you are seeing.

It would be helpful if you could identify which of the following cases is triggering this problem:
a) calling Catalina.setParentClassLoader() and passing null
b) calling Catalina.setParentClassLoader() and passing a class loader that does not have the system class loader as an ancestor
c) Something else.

If a) or b) any further details of why this is being done would be helpful.
If c) you'll need to provide details.
Comment 10 olamy 2012-03-19 21:40:18 UTC
>It would be helpful if you could identify which of the following cases is
>triggering this problem:
>a) calling Catalina.setParentClassLoader() and passing null
>b) calling Catalina.setParentClassLoader() and passing a class loader that does
>not have the system class loader as an ancestor
>c) Something else.

happen for b).
The c) case is not using setParentClassLoader(). I discovered my trouble using this.
BTW I saw your commit which should fix (not tested yet)
Comment 11 Konstantin Kolinko 2012-03-20 14:40:04 UTC
Maybe (In reply to comment #0)
> Upgrading the maven plugin 7.0.26, the goal which runs tomcat with a conf dir
> (not the embed version) generate the following NPE 
> Log:
>     at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:236)

1. Where is "Caused by" and its stacktrace?

Line 236 wraps other exception and rethrows it, but you are not saying what it is and where it comes from.

2. This LifecycleBase.stop() is called from LifecycleBase.start() on line 141 which is unusual and happens only if the component is already in the state of FAILED.

Thus there must be some preceding messages in the logs.
Comment 12 olamy 2012-03-20 14:57:18 UTC
20 mars 2012 15:55:11 org.apache.catalina.startup.Catalina load
GRAVE: Catalina.start
org.apache.catalina.LifecycleException: Failed to initialize component [StandardServer[2008]]
	at org.apache.catalina.util.LifecycleBase.init(LifecycleBase.java:106)
	at org.apache.catalina.startup.Catalina.load(Catalina.java:594)
	at org.apache.catalina.startup.Catalina.start(Catalina.java:633)
	at org.apache.tomcat.maven.plugin.tomcat7.run.AbstractRunMojo.startContainer(AbstractRunMojo.java:684)
	at org.apache.tomcat.maven.plugin.tomcat7.run.AbstractRunMojo.execute(AbstractRunMojo.java:372)
	at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:101)
	at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:209)
	at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
	at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
	at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:84)
	at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:59)
	at org.apache.maven.lifecycle.internal.LifecycleStarter.singleThreadedBuild(LifecycleStarter.java:183)
	at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:161)
	at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:320)
	at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:156)
	at org.apache.maven.cli.MavenCli.execute(MavenCli.java:544)
	at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:197)
	at org.apache.maven.cli.MavenCli.main(MavenCli.java:141)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:290)
	at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:230)
	at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:409)
	at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:352)
Caused by: java.lang.NullPointerException
	at org.apache.catalina.core.StandardServer.initInternal(StandardServer.java:810)
	at org.apache.catalina.util.LifecycleBase.init(LifecycleBase.java:102)
	... 25 more
20 mars 2012 15:55:11 org.apache.catalina.startup.Catalina load
INFO: Initialization processed in 681 ms
20 mars 2012 15:55:11 org.apache.catalina.startup.Catalina start
GRAVE: Catalina.start: 
org.apache.catalina.LifecycleException: Failed to stop component [StandardServer[2008]]
	at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:236)
	at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:141)
	at org.apache.catalina.startup.Catalina.start(Catalina.java:645)
	at org.apache.tomcat.maven.plugin.tomcat7.run.AbstractRunMojo.startContainer(AbstractRunMojo.java:684)
	at org.apache.tomcat.maven.plugin.tomcat7.run.AbstractRunMojo.execute(AbstractRunMojo.java:372)
	at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:101)
	at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:209)
	at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
	at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
	at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:84)
	at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:59)
	at org.apache.maven.lifecycle.internal.LifecycleStarter.singleThreadedBuild(LifecycleStarter.java:183)
	at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:161)
	at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:320)
	at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:156)
	at org.apache.maven.cli.MavenCli.execute(MavenCli.java:544)
	at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:197)
	at org.apache.maven.cli.MavenCli.main(MavenCli.java:141)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:290)
	at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:230)
	at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:409)
	at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:352)
Caused by: org.apache.catalina.LifecycleException: An invalid Lifecycle transition was attempted ([before_stop]) for component [org.apache.catalina.deploy.NamingResources@7f1f91ac] in state [INITIALIZED]
	at org.apache.catalina.util.LifecycleBase.invalidTransition(LifecycleBase.java:408)
	at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:219)
	at org.apache.catalina.core.StandardServer.stopInternal(StandardServer.java:756)
	at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:232)
	... 25 more
Comment 13 Konstantin Kolinko 2012-03-20 16:53:34 UTC
(In reply to comment #8)
(In reply to comment #9)

AbstractRunMojo#startContainer() creates a "new Catalina()". I do not see it ever calling setParentClassLoader() and Catalina#parentClassLoader defaults to its own classloader.

Is Catalina.class.getClassLoader() the one that does not have system CL in its parents chain?


> What I don't understand is why in the parent classLoader chain having
> SystemClassLoader is mandatory!


Anyway as far as WebappClassLoader class goes

1) It has to follow its documentation, [1] in 7.0 and in more details in 5.5, where shared class loader was actually used. There is lookup sequence,

"Therefore, from the perspective of a web application, class or resource loading looks in the following repositories, in this order: (...)" - see [1]

[1] http://tomcat.apache.org/tomcat-7.0-doc/class-loader-howto.html


2) WebappClassLoader has its own explicit call to getSystemClassLoader() and treats the system class loader specially and separately from its parent, to follow its specific delegating pattern of calling
 A system CL
 B webapp CL
 C parent CL.

Therefore for the WebappClassLoader class it does not matter where its "parent CL" is rooted. It does not impact its behaviour.

So using shared CL that does not have system CL as a parent does not seriously impact Tomcat.

I do not mind a (cl != null) check to be added in the loop in r1242101.


It seems that the code in r1242101 is the first one that actually looks into the shared classloader and is outside of a webapp. Such work is usually scoped to the web application. The shared CL is actually used in
WebappLoader#createClassLoader().

See also my comments in "Re: r1242101" on dev@.
http://markmail.org/message/rgd7tzz5zks6l5wn

(In reply to comment #12)
> 
> Caused by: org.apache.catalina.LifecycleException: An invalid Lifecycle
> transition was attempted ([before_stop]) for component
> [org.apache.catalina.deploy.NamingResources@7f1f91ac] in state [INITIALIZED]

Interesting... A small issue though. It tries to stop a component that has not been started?
Comment 14 Mark Thomas 2012-03-27 18:41:46 UTC
Given that the root cause is a class loader that does not have the 'normal' hierarchy we see in the normal (non-embedded) use case (rather than something setting the Catalina's parent class loader to null) then adding the null check is a reasonable solution. I've looked for other places in the codebase where we traverse the class loader hierarchy and don't see a need for further changes.

The fix has been applied to trunk and 7.0.x and will be included in 7.0.27 onwards.