|Summary:||NPE in Logging due to classloading|
|Product:||Tomcat 5||Reporter:||Thomas Strauss <t.strauss>|
|Component:||Catalina||Assignee:||Tomcat Developers Mailing List <dev>|
Simple War File That Demonstrates NPE Behavior on Reload
This WAR Contains a Modified Log4j That Does NOT Exhibit NPE Behavior
Description Thomas Strauss 2007-03-23 14:03:51 UTC
Actually this Bug happens also in 5.5.23. It does not happen in 5.5.9. Ok, we have a webapplication that contains each and every lib it will ever require. Due to some prepackaged JAR-files it even contains log4j and commons-logging twice. One in WEB-INF/lib and one inside a jar in WEB-INF/lib. There is a commons-logging (1.0.4) in commons/lib of tomcat, but not log4j. We use Anti-Jar|Resource-Locking on Windows. When deploying an application using the manager or the ant-tasks, everything is fine. Also removing and deploying a new version works. If we reload the application, it will refuse to reload with an NPE. I tracked the application source in the debugger and trapped the NPE. I have never seen this, but this is the fact: I can see the complete call stack and inside Log4j the function converter.format is called with a valid reference to a StringBuffer which suddenly becomes "null" in the static method PatternConverter.spacePad()! I can only explain this with the hierarchical Classloader mixing classes and beeing unable to transport the reference to the StringBuffer from the instance that is loaded in another class to the Classloader carriing the static PatternConverter class. I could "solve" this issue by putting the log4j.jar in the common/lib directory and restart the server. If I remove it from there or if I remove it completely from my application (also from jar files in my application) it works. Best regards, Thomas Stacktrace follows: Exception on reload in /manager [23 Mrz 22:01:59:139][INFO ][closing any session, if still open ][de.srs.pen.vwopenday2006.StrutsServlet] [23 Mrz 22:01:59:139][DEBUG][closeSessionSilent on sid 0 ][de.srs.pen.dao.DAOSessionFactory] [23 Mrz 22:01:59:139][DEBUG][Initializing Thread Local for refcounting on 0 ][de.srs.pen.dao.DAOSessionFactory] [23 Mrz 22:01:59:139][DEBUG][closing session, removing thread locals ][de.srs.pen.dao.DAOSessionFactory] [23 Mrz 22:01:59:139][DEBUG][Resetting refcounter on session 0 ][de.srs.pen.dao.DAOSessionFactory] [23 Mrz 22:01:59:139][DEBUG][reset counter 0 to 0 ][de.srs.pen.dao.DAOSessionFactory] [23 Mrz 22:01:59:139][DEBUG][checking for leftover transaction ][de.srs.pen.dao.DAOSessionFactory] [23 Mrz 22:01:59:139][DEBUG][Resetting refcounter on Transaction 0 ][de.srs.pen.dao.DAOSessionFactory] [23 Mrz 22:01:59:139][DEBUG][Initializing Thread Local for refcounting on 0 ][de.srs.pen.dao.DAOSessionFactory] [23 Mrz 22:01:59:139][DEBUG][reset counter 0 to 0 ][de.srs.pen.dao.DAOSessionFactory] [23 Mrz 22:01:59:139][INFO ][Destroy complete ][de.srs.pen.vwopenday2006.StrutsServlet] [23 Mrz 22:01:59:139][DEBUG][JspServlet.destroy() ][org.apache.jasper.servlet.JspServlet] [23 Mrz 22:01:59:139][DEBUG][Stopping filters ][org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/SurveyDay]] [23 Mrz 22:01:59:139][DEBUG][Stopping ][org.apache.catalina.session.ManagerBase] [23 Mrz 22:01:59:139][DEBUG][Unloading persisted sessions ][org.apache.catalina.session.ManagerBase] [23 Mrz 22:01:59:139][DEBUG][Saving persisted sessions to SESSIONS.ser ][org.apache.catalina.session.ManagerBase] [23 Mrz 22:01:59:139][DEBUG][Unloading 0 sessions ][org.apache.catalina.session.ManagerBase] [23 Mrz 22:01:59:139][DEBUG][Expiring 0 persisted sessions ][org.apache.catalina.session.ManagerBase] [23 Mrz 22:01:59:139][DEBUG][Unloading complete ][org.apache.catalina.session.ManagerBase] 23.03.2007 22:02:00 org.apache.commons.modeler.Registry registerComponent SCHWERWIEGEND: Null component Catalina:type=JspMonitor,name=jsp,WebModule=//localhost/SurveyDay,J2EEApplication=none,J2EEServer=none 23.03.2007 22:02:00 org.apache.catalina.core.ApplicationContext log SCHWERWIEGEND: HTMLManager: ManagerServlet.reload[/SurveyDay] java.lang.NullPointerException at org.apache.log4j.helpers.PatternConverter.spacePad(PatternConverter.java:92) at org.apache.log4j.helpers.PatternConverter.format(PatternConverter.java:71) at org.apache.log4j.PatternLayout.format(PatternLayout.java:495) at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:292) at org.apache.log4j.WriterAppender.append(WriterAppender.java:150) at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:221) at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:57) at org.apache.log4j.Category.callAppenders(Category.java:187) at org.apache.log4j.Category.forcedLog(Category.java:372) at org.apache.log4j.Category.log(Category.java:864) at org.apache.commons.logging.impl.Log4JLogger.debug(Log4JLogger.java:110) at org.apache.catalina.session.ManagerBase.init(ManagerBase.java:723) at org.apache.catalina.session.StandardManager.start(StandardManager.java:619) at org.apache.catalina.core.StandardContext.start(StandardContext.java:4166) at org.apache.catalina.core.StandardContext.reload(StandardContext.java:3025) at org.apache.catalina.manager.ManagerServlet.reload(ManagerServlet.java:907) at org.apache.catalina.manager.HTMLManagerServlet.reload(HTMLManagerServlet.java:488) at org.apache.catalina.manager.HTMLManagerServlet.doGet(HTMLManagerServlet.java:99) at javax.servlet.http.HttpServlet.service(HttpServlet.java:690) at javax.servlet.http.HttpServlet.service(HttpServlet.java:803) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:269) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:210) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:174) at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:525) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:151) at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:870) at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665) at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528) at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81) at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:685) at java.lang.Thread.run(Unknown Source) 23.03.2007 22:02:00 org.apache.catalina.core.ApplicationContext log INFO: HTMLManager: list: Listing contexts for virtual host 'localhost'
Comment 1 Yoav Shapira 2007-03-25 13:51:48 UTC
I doublt we should adjust very well-tested classloader because your packaging is messed up. It's your responsibility (or the responsibility of whoever packages your webapp) to not include duplicate and/or conflicting versions of its dependent libraries. If you can't do this, at the very least you will want to separate Tomcat's logging libs from your own, perhaps by moving the commons-logging stuff from $CATALINA_HOME/common/lib to $CATALINA_HOME/server/lib.
Comment 2 Thomas Strauss 2007-03-26 03:42:08 UTC
Then: Why does the very same package deploy on 5.5.9? And: Yes there are duplicate classes in the packages, but in the end you will never be able to avoid it. And: There is the, mabe even written law, that it is definitely good practice that one should package all required classes inside the web-app\lib path, besides native lib adapters. If this is suddenly not true any more, then how do you deploy different independent applications with incompatible commons-logging implementations? The answer may be easy for you, but I think that I have done valid packaging and that the classloader is in fact mixing packages from /common/lib with packages from my application web-inf/lib on application reload. It looks like it is preferring the common/lib instance of the jar on reloading, not recognizing, that there is a jar in WEB-IN/lib carrying the same class. Following the rules of classloading, the Web-App's jar should be used. But you are free to correct me if I am wrong.
Comment 3 Robert Weber 2007-03-26 06:38:42 UTC
This bug should not be dismissed. My own web application includes one instance each of commons-logging-1.1.jar and log4j-1.2.14.jar. My $CATALINA_HOME/common/lib does not contain a commons-logging jar (unless the class files are embedded in one of the other jars???). Additionally, the NPE is thrown by log4j code. In fact, I have done some debugging on the log4j code-base, and the behavior that I witnessed defied standard java convention. That is, the NPE is caused by certain static class fields (whose values are set at instantiation) arbitrarily becoming null when one attempts to reload the context that contains the log4j jar. In effect, certain static members of the log4j classes are not being re-initialized when the context is reloaded. For my part, I have worked around the issue by patching my log4j to re-initialize the problem fields itself when they are detected to be null, but in all honesty, this hack should be unnecessary. Now, I would not say that this bug is a number one priority show-stopper (or even that it should be fixed), but it should, at least, be acknowledged.
Comment 4 Shankar Unni 2007-03-26 17:12:10 UTC
You do remember that when you undeploy an application (while restarting it), the tomcat classloader reflects on all the classes it's unloading, and resets any static members to null. Depending on the order in which it does this, if there are some other threads still running, they can get NPEs..
Comment 5 Robert Weber 2007-03-27 02:31:56 UTC
Why should static members be reset to null? When static members are reset to null, do they eventually get re-initialized? For instance if I declare: static String foo = "bar"; When the context re-load completes, does "bar".equals(foo) evaluate to true?
Comment 6 Thomas Strauss 2007-03-27 02:34:30 UTC
I have no additional threads running in this application. Besides that, in all applications we manage, the running threads are stopped on destroy(). But neverthelesse: I have started to move JARs around and have now the following situation: Like Robert Weber, I have no Log4J and no commons-logging in the commons/lib or in shared/lib. I have a logging.properties in commons/classes. I have commons-loggings and log4j only ONCE in the webapplication this is tested with. UNloading works fine, as soon as reloading starts, the NPE is thrown. This time it is not even inside the application, but in Tomcat Manager. I guess, that the class in log4j is already distorted. Could it be that tomcat or commons-logging thinks it should use log4j because the classes are found inside the application?
Comment 7 Mark Thomas 2007-04-09 16:36:38 UTC
The issue is that something is keeping a reference to a logger that it shouldn't. This is going to cause a memory leak and what you are seeing are the side-effects of Tomcat's attempts to prevent memory leaks. If you can provide the simplest possible test war the exhibits this issue I'll happily debug my way though the code and figure out what is going wrong. Without such a test case, there is little that can be done to investigate this issue.
Comment 8 Robert Weber 2007-04-19 06:53:34 UTC
Created attachment 20001 [details] Simple War File That Demonstrates NPE Behavior on Reload I deployed this WAR on the 5.5.20 and 5.5.23 binary distributions. This WAR contains a simple servlet that contains a static reference to a Log instance. The vanilla binary of log4j 1.2.14 is on the classpath, so commons-logging is using the log4j log provider. The servlet runs as expected (it just prints a message to the client and prints the same message in the log). However, when I click on the reload link for the application in the manager web application, the context fails to reload with a NullPointerException. The NullPointerException traces back to log4j classes. Specifically, I discovered that there are several instances where static members of the log4j classes in question are improperly set to null by something outside of log4j and not re-initialized.
Comment 9 Robert Weber 2007-04-19 06:59:22 UTC
Created attachment 20002 [details] This WAR Contains a Modified Log4j That Does NOT Exhibit NPE Behavior This WAR is identical to "20001: Simple War File That Demonstrates NPE Behavior on Reload" except that it contains a version of log4j that I modified. Recall that I stated that there are several static members in log4j classes that are being improperly nullified and not re-initialized. The modifications that I made to the log4j binary included in this WAR simply amount to checking if the members in question are in a null state before using them. Then, if they are null, I re-initialize the members to their starting values. This WAR does NOT exhibit a NullPointerException when reloaded. Additionally, the servlet continues to function normally after several reloads.
Comment 10 Remy Maucherat 2007-04-19 07:03:42 UTC
Set the system property org.apache.catalina.loader.WebappClassLoader.ENABLE_CLEAR_REFERENCES to false.
Comment 11 Robert Weber 2007-04-19 07:11:09 UTC
I attached two test cases. The first contains a vanilla binary of log4j 1.2.14, and it does exhibit the NPE behavior. The second contains a modified binary of log4j 1.2.14, and it does NOT exhibit the NPE behavior. More details regarding the test cases can be found by clicking on the attachment links. Two further notes, though: 1.) The NPE behavior is exhibited in the first test case regardless of whether the servlet containing the log reference has been previously invoked. 2.) The first test case does NOT exhibit the NPE behavior under tomcat 6.0.10. I suspect that this has something to do with the standardization of logging under that tomcat platform.(In reply to comment #10) > Set the system property > org.apache.catalina.loader.WebappClassLoader.ENABLE_CLEAR_REFERENCES to false. Why?
Comment 12 Robert Weber 2007-04-19 08:00:40 UTC
(In reply to comment #10) > Set the system property > org.apache.catalina.loader.WebappClassLoader.ENABLE_CLEAR_REFERENCES to false. I set this property in $JAVA_OPTS to false. I additionally verified that the property was, in fact, being set by querying System.getProperty(). However, setting this property to false has no effect with regards to resolving this bug.
Comment 13 Remy Maucherat 2007-04-19 08:05:05 UTC
Ok, you did not post the NPE, so I cannot know for sure. Regardless, this issue is not going to be addressed, and you should probably be packaging your web application differently.
Comment 14 Robert Weber 2007-04-19 08:29:36 UTC
(In reply to comment #13) > Ok, you did not post the NPE, so I cannot know for sure. Regardless, this issue > is not going to be addressed, and you should probably be packaging your web > application differently. There is little variation between my NPE and the one that has been posted previously. By "packaging your web application differently", what exactly do you mean? Placing log4j in the container classpath instead of the application classpath? Not using a separately packaged commons-logging? Not using log4j at all?
Comment 15 Mark Thomas 2007-04-19 09:14:10 UTC
org.apache.catalina.loader.WebappClassLoader.ENABLE_CLEAR_REFERENCES only has an effect in the 6.0.x branch. This option hasn't been ported to the 5.5.x branch.
Comment 16 James McIntosh 2007-04-19 16:49:36 UTC
I am hunting for permgen leaks as the moment and have wildly inconsistent results with context reloading. Most of the time the first loaded webapp will not garbage but some subsequent reloads are grabaged. The Tomcat FAQ section should have it spelled out under a title called 'Log4J' the best way to include log4j and the implications of having the commons-logging. Currently the 'How should I log in my own webapps?' section 'strongly recommends' using log4j but does not address any of these issues. There are many different approaches listed on the web some of which go as far as replacing the bootstrap logging. e.g. /commons/lib/commons-logging.jar /commons/lib/log4j.jar /commons/classes/log4j.properties /<APP>/WEB-INF/lib/log4j.jar /<APP>/WEB-INF/lib/commons-logging.jar /<APP>/WEB-INF/classes/log4j.properties It should spell out things like the following: If you have commons-logging.jar in your webapp then it will not garbage. If you don't have log4j.jar in your webapp then you will not get per-content logging of you /<APP>/classes/log4j.properties file Kind regards James McIntosh
Comment 17 Mark Thomas 2007-04-19 19:59:47 UTC
I am re-opening this since the bug is the result of the nulling out static and final fields from classes loaded by the web app classloader. As noted in the code comments this is a workaround for some apparent GC bugs. The workaround will not be removed since it is useful for many users and has been shown to resolve some memory leak issues. I will make this workaround optional in TC5.5.x, as it is in TC6.0.x.
Comment 18 Mark Thomas 2007-04-19 20:00:53 UTC
*** Bug 37956 has been marked as a duplicate of this bug. ***
Comment 19 Mark Thomas 2007-04-19 20:20:27 UTC
I have added this configuration option to svn. The provided test case no longer throws an NPE on reload, providing the new configuration option is set to false. This will be included in 5.5.24 onwards.
Comment 20 Mark Thomas 2007-04-22 17:34:38 UTC
Note that after some further investigation the root cause is actually a logging related memory leak (bug 42172). This leak has been fixed in svn and the test case for this issue now runs without an NPE with or without setting the system property org.apache.catalina.loader.WebappClassLoader.ENABLE_CLEAR_REFERENCES to false.
Comment 21 Mark Thomas 2007-04-23 19:43:08 UTC
Having looked at this in more detail there is still a theoretical possibility of an NPE unless the system property org.apache.catalina.loader.WebappClassLoader.ENABLE_CLEAR_REFERENCES is set to false Any such occurrence should be rare.
Comment 22 Aaron Digulla 2007-11-02 13:27:46 UTC
I'm wary of the implications of the comments in this bug. Let me summarize to make sure we're all on the same level. I'm coming from bug 40212, btw. - To get per-app logging, one has to use the setup of comment #16 (one instance of log4j and commons-logging per tomcat and per webapp). - If you do this, you can get NPEs at the next startup because Tomcat clears static fields during undeploy. Can you explain to me how it can be possible that the instance N+1 of my webapp can see anything from instance N? If that would happen, it would either mean a) that tomcat is handing me classes which are in the common classpath despite them being overridden in my webapp b) that I get references to classes which ought no to exist anymore a) means that classes from tomcat leak into my webapp and b) means that classes from instance N can leak into N+1. Neither must be possible.
Comment 23 Dan Armbrust 2008-07-09 13:40:47 UTC
Could someone with knowledge of this bug read through this comment on a bug in Log4j? https://issues.apache.org/bugzilla/show_bug.cgi?id=43867#c39 I believe that this is an example of tomcat breaking other code due to the way that it clears references, when org.apache.catalina.loader.WebappClassLoader.ENABLE_CLEAR_REFERENCES is left to its default of "on". In this case, log4j prevents a null pointer, and simply logs an error. But other code could very easily take a null pointer for the same reason. It doesn't seem like the current implementation of clear references is safe to use. Comment #17 indicates that clearReferences is a workaround to a different bug, and won't be removed. Perhaps that should be re-evaluated. I now have to disable clearReferences on my tomcat installs on most of my webapps which use log4j to prevent users from seeing a scary looking error being logged by log4j.
Comment 24 Mark Thomas 2008-07-09 14:01:42 UTC
Yes, if you and the libraries you use code their use of loggers correctly you should be safe with setting ENABLE_CLEAR_REFERENCES to false. I can't speak for the other developers but I am happy with the current default for this being true on the basis it fixes more issues than it causes. That said, I am more than happy to look into any test case (like bug 42172) that could be a Tomcat memory leak.