We have a multi-threaded enterprise application running on a highly loaded WebLogic server. We are getting the error below very frequently, and it causes our production server to go down. Please see the thread trace below:

"ExecuteThread: '33' for queue: 'weblogic.kernel.Default'" daemon prio=5 tid=0x30a8f0 nid=0x2d waiting for monitor entry [8fcfe000..8fd019bc]
   at org.apache.log4j.RollingFileAppender.rollOver(RollingFileAppender.java:126)
   at org.apache.log4j.RollingFileAppender.subAppend(RollingFileAppender.java:228)
   at org.apache.log4j.WriterAppender.append(WriterAppender.java:150)
   at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:221)
   - locked <b71560a8> (a org.apache.log4j.RollingFileAppender)
   at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:57)
   at org.apache.log4j.Category.callAppenders(Category.java:187)
   - locked <b715d2d0> (a org.apache.log4j.Logger)
   at org.apache.log4j.Category.forcedLog(Category.java:372)

Thread dump 1/4

"Thread-25402" prio=5 tid=0x14f92d0 nid=0x638b waiting for monitor entry [8a301000..8a3019bc]
   at org.apache.log4j.Category.callAppenders(Category.java:185)
   - waiting to lock <b715d2d0> (a org.apache.log4j.Logger)
   at org.apache.log4j.Category.forcedLog(Category.java:372)
   at org.apache.log4j.Category.log(Category.java:864)

Is there any known solution or patch for this error?

Thanks,
Srinivas
*** Bug 41215 has been marked as a duplicate of this bug. ***
Hello Srinivas, Which exact version of log4j are you using? Which JDK?
Hello Ceki,

We are currently using a log4j build that is two to three years old (we don't have the source code and are not sure how to check the log4j version). The JDK version is 1.3.1.

Thanks,
Srinivas
From log4j.dtd I can see the version as 1.2:

<?xml version="1.0" encoding="UTF-8" ?>
<!-- Authors: Chris Taylor, Ceki Gulcu. -->
<!-- Version: 1.2 -->
It must be an early version in the 1.2 series. What is the size in bytes of the log4j.jar file? By the way, the log4j jar file usually contains the version number in its name.
The size is 352,668 bytes. It looks like an earlier team here just renamed the file to log4.jar.
The file size is the same as log4j-1.2.8.jar; however, the line number from RollingFileAppender does not correspond to any synchronization block in that version (the line number from Category does). If you do a "unzip -l log4j.jar", you should see the timestamps within the jar file. log4j-1.2.8.jar would have timestamps of 2003-02-20. If you extract the members of log4j-1.2.8.jar ("jar xf log4j-1.2.8.jar"), the META-INF/MANIFEST.MF file would have an Implementation-Version of 1.2.8. I've reviewed the Subversion history for RollingFileAppender in the v1_2 branch. Line 126 falls within RFA.rollOver in revisions 309239 (2001-07-20) and later. The implementation of RFA has not changed substantially in that period, and that line number corresponds to either a blank line, a File constructor, or a File.exists call, none of which should acquire a lock on a category.
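If a programmatic check is more convenient, a small sketch like the following (a hypothetical helper, not part of log4j) reads the Implementation-Version recorded for the org/apache/log4j/ entry directly from the jar's manifest:

import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

// Hypothetical helper: prints the Implementation-Version recorded for the
// org/apache/log4j/ package entry in a jar's manifest.
public class PrintLog4jVersion {
    public static void main(String[] args) throws Exception {
        JarFile jar = new JarFile(args[0]); // e.g. "log4j.jar"
        Manifest mf = jar.getManifest();
        Attributes attrs = mf.getAttributes("org/apache/log4j/");
        System.out.println(attrs == null
                ? "no version attributes found"
                : attrs.getValue("Implementation-Version"));
        jar.close();
    }
}

Run against the renamed jar (java PrintLog4jVersion log4j.jar), it should print 1.2.8 if the jar really is the 1.2.8 build.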
I just unzipped the jar and found the timestamps to be 02/20/2003. Here's the content of the manifest file:

Manifest-Version: 1.0
Created-By: Apache Ant 1.5.1

Name: org/apache/log4j/
Implementation-Title: log4j
Implementation-Version: 1.2.8
Implementation-Vendor: "Apache Software Foundation"
I could not find the 1.2.8 distribution via log4j's binary download page [1]. It appears that log4j versions prior to 1.2.9 are no longer available. However, looking at the source code available on Subversion [2], it is not easy to identify the code holding the locks. Line 126 of RollingFileAppender reads:

target = new File(fileName + '.' + (i + 1));

Is it possible that the File constructor tries to acquire a lock and gets blocked? It may be a JDK 1.3 problem more than a log4j bug.

[1] http://logging.apache.org/site/binindex.cgi
[2] http://svn.apache.org/viewvc/logging/log4j/tags/v1_2_7/src/java/org/apache/log4j/RollingFileAppender.java?annotate=309639
Do you think this issue will be resolved if I update our existing version of log4j to the latest version?

Thanks,
Srinivas
I am sorry to say that, given the difficulty of identifying the problem, there is no guarantee that upgrading will resolve the matter. Having said that, upgrading might help, either by solving the problem or by helping us identify it.
Please let me know if you want me to send the log4j binary file that we were using in our application.

Thanks,
Srinivas
By the way, the 1.2.8 log4j download seems to be linked at the bottom of the page ('earlier versions') -> http://archive.apache.org/dist/logging/log4j/ . Its MD5 is 18a4ca847248e5b8606325684342701c. Your manifest matches the 1.2.8 manifest.

Line 185 of Category synchronizes on the Category and then its parents (in the first stack... this ends up in RollingFileAppender), and, as Ceki said, it appears to be blocked in the File call. The second stack is waiting to enter that lock. Is it always the same stack trace blocked on line 126 of RollingFileAppender? If so, upgrading the JVM (or switching from Sun to IBM, or IBM to Sun) might be of value, though I suspect you're constrained by the vendor's versioning. Knowing the platform and vendor of the JVM might be of value to anyone else who hits the problem, but I suspect there's little log4j could be doing in this situation.
I have also seen this happen at least once. A thread just sits in the writeBytes call:

at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:260)
at sun.nio.cs.StreamEncoder$CharsetSE.writeBytes(StreamEncoder.java:336)
at sun.nio.cs.StreamEncoder$CharsetSE.implFlushBuffer(StreamEncoder.java:404)
at sun.nio.cs.StreamEncoder$CharsetSE.implFlush(StreamEncoder.java:408)
at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:152)
at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:213)
at org.apache.log4j.helpers.QuietWriter.flush(QuietWriter.java:57)
at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:315)
at org.apache.log4j.RollingFileAppender.subAppend(RollingFileAppender.java:236)
at org.apache.log4j.WriterAppender.append(WriterAppender.java:159)
at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:230)
at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:65)
at org.apache.log4j.Category.callAppenders(Category.java:203)
at org.apache.log4j.Category.forcedLog(Category.java:388)
at org.apache.log4j.Category.log(Category.java:835)
at com.scalent.shared.util.Logger.log(Logger.java:352)

I am using log4j 1.2.14.

$ java -v
java version "1.5.0_08"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_08-b03)
Java HotSpot(TM) Server VM (build 1.5.0_08-b03, mixed mode)

$ uname -a
SunOS build-sparc10 5.10 Generic sun4u sparc SUNW,Sun-Fire-V210
I'm very confused about what could be causing this, but here's some more information that I have:

* We run the same code with the same JVM version on Linux 2.6, Solaris (Sparc/x86), and Windows.
* We see these problems only on Solaris machines (Sparc and x86).
* We have hundreds of unit tests, and they all use logging heavily. In several consecutive runs, the exact same logging statement described in Additional Comment #14 caused the deadlock, not any of the thousands of other logging statements.
* I see one other case of a deadlock during this test run, which is described below.
* Occasionally I will see a similar deadlock in a completely different piece of code that has nothing to do with log4j. A thread will be sitting in Object.wait() when I know it must have had Object.notify() delivered to it. I have never seen this happen with the same unit test on Linux/Windows. This unit test runs many times a day on all our platforms.

This makes me think the problem is a very rare bug in the Solaris JVM or in Solaris itself that is triggered by a specific sequence of events in our code. Although that seems unlikely...

Here's another instance of our "deadlock":

at org.apache.log4j.helpers.CountingQuietWriter.write(CountingQuietWriter.java:45)
at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:301)
at org.apache.log4j.RollingFileAppender.subAppend(RollingFileAppender.java:236)
at org.apache.log4j.WriterAppender.append(WriterAppender.java:159)
at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:230)
at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:65)
at org.apache.log4j.Category.callAppenders(Category.java:203)
at org.apache.log4j.Category.forcedLog(Category.java:388)
at org.apache.log4j.Category.log(Category.java:835)
at com.scalent.shared.util.Logger.log(Logger.java:352)
*** Bug 42213 has been marked as a duplicate of this bug. ***
Yesterday we had the same situation on a Linux box:

java -version:
java version "1.5.0_11"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_11-b03)
Java HotSpot(TM) Server VM (build 1.5.0_11-b03, mixed mode)

log4j version is 1.2.8.

2007-10-17 14:37:14,207 ERROR [net.portrix.powerdial.adapters.ceemes.CeemesListener] CeemesListener timeout/deadlock Stacktraces:
"OpenCallGroup Thread" prio=5 Thread id=36 BLOCKED
  org.apache.log4j.Category.callAppenders(Category.java:185)
  org.apache.log4j.Category.forcedLog(Category.java:372)
  org.apache.log4j.Category.debug(Category.java:241)
  net.portrix.powerdial.adapters.asterisk.AsteriskCall.setState(AsteriskCall.java:216)
  net.portrix.powerdial.adapters.asterisk.AsteriskCall.eventStart(AsteriskCall.java:51)
  net.portrix.powerdial.adapters.asterisk.AsteriskCallManager.startCall(AsteriskCallManager.java:220)
--
2007-10-17 14:37:14,207 ERROR [net.portrix.powerdial.adapters.ceemes.CeemesListener] CeemesListener timeout/deadlock Stacktraces:
"ManagerConnection-0-Reader-0" daemon prio=5 Thread id=30 BLOCKED
  org.apache.log4j.Category.callAppenders(Category.java:185)
  org.apache.log4j.Category.forcedLog(Category.java:372)
  org.apache.log4j.Category.debug(Category.java:241)
  net.portrix.powerdial.adapters.asterisk.AsteriskCallManager.onManagerEvent(AsteriskCallManager.java:873)
  org.asteriskjava.manager.internal.ManagerConnectionImpl.fireEvent(ManagerConnectionImpl.java:1121)
  org.asteriskjava.manager.internal.ManagerConnectionImpl.dispatchEvent(ManagerConnectionImpl.java:1105)
Found one Java-level deadlock: ============================= "http-8080-8": waiting to lock monitor 0x0809e77c (object 0x7b8aff60, a org.apache.log4j.Logger), which is held by "pool-1-thread-6" "pool-1-thread-6": waiting to lock monitor 0x0809e6bc (object 0x7b8a2258, a org.apache.log4j.RollingFileAppender), which is held by "http-8080-8" Java stack information for the threads listed above: =================================================== "http-8080-8": at org.apache.log4j.Category.callAppenders(Category.java:204) - waiting to lock <0x7b8aff60> (a org.apache.log4j.Logger) at org.apache.log4j.Category.forcedLog(Category.java:391) at org.apache.log4j.Category.log(Category.java:856) at org.apache.commons.logging.impl.Log4JLogger.debug(Log4JLogger.java:110) at org.hibernate.jdbc.AbstractBatcher.log(AbstractBatcher.java:401) at org.hibernate.jdbc.AbstractBatcher.getPreparedStatement(AbstractBatcher.java:482) at org.hibernate.jdbc.AbstractBatcher.getPreparedStatement(AbstractBatcher.java:423) at org.hibernate.jdbc.AbstractBatcher.prepareQueryStatement(AbstractBatcher.java:139) at org.hibernate.loader.Loader.prepareQueryStatement(Loader.java:1547) at org.hibernate.loader.Loader.doQuery(Loader.java:673) at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:236) at org.hibernate.loader.Loader.loadCollection(Loader.java:1994) at org.hibernate.loader.collection.CollectionLoader.initialize(CollectionLoader.java:36) at org.hibernate.persister.collection.AbstractCollectionPersister.initialize(AbstractCollectionPersister.java:565) at org.hibernate.event.def.DefaultInitializeCollectionEventListener.onInitializeCollection(DefaultInitializeCollectionEventListener.java:60) at org.hibernate.impl.SessionImpl.initializeCollection(SessionImpl.java:1716) at org.hibernate.collection.AbstractPersistentCollection.initialize(AbstractPersistentCollection.java:344) at org.hibernate.collection.AbstractPersistentCollection.read(AbstractPersistentCollection.java:86) at org.hibernate.collection.PersistentSet.toString(PersistentSet.java:309) at java.lang.String.valueOf(String.java:2615) at java.lang.StringBuilder.append(StringBuilder.java:116) at com.company.ws.product.services.commons.User.toString(User.java:314) at sun.reflect.GeneratedMethodAccessor1319.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:585) at org.hibernate.proxy.pojo.cglib.CGLIBLazyInitializer.invoke(CGLIBLazyInitializer.java:157) at com.company.ws.product.services.commons.User$$EnhancerByCGLIB$$9ab22e0e.toString(<generated>) at java.lang.String.valueOf(String.java:2615) at java.lang.StringBuilder.append(StringBuilder.java:116) at com.company.ws.product.criterion.SimpleExpression.toString(SimpleExpression.java:125) at java.lang.String.valueOf(String.java:2615) at java.util.AbstractCollection.toString(AbstractCollection.java:454) at java.lang.String.valueOf(String.java:2615) at java.lang.StringBuilder.append(StringBuilder.java:116) at com.company.ws.product.SearchCriteria.toString(SearchCriteria.java:377) at org.apache.log4j.or.DefaultRenderer.doRender(DefaultRenderer.java:36) at org.apache.log4j.or.RendererMap.findAndRender(RendererMap.java:80) at org.apache.log4j.spi.LoggingEvent.getRenderedMessage(LoggingEvent.java:362) at org.apache.log4j.helpers.PatternParser$BasicPatternConverter.convert(PatternParser.java:403) at org.apache.log4j.helpers.PatternConverter.format(PatternConverter.java:65) at 
org.apache.log4j.PatternLayout.format(PatternLayout.java:502) at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:302) at org.apache.log4j.RollingFileAppender.subAppend(RollingFileAppender.java:263) at org.apache.log4j.WriterAppender.append(WriterAppender.java:160) at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251) - locked <0x7b8a2258> (a org.apache.log4j.RollingFileAppender) at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66) at org.apache.log4j.Category.callAppenders(Category.java:206) - locked <0x7b8af630> (a org.apache.log4j.Logger) at org.apache.log4j.Category.forcedLog(Category.java:391) at org.apache.log4j.Logger.trace(Logger.java:172) at com.company.ws.product.HibernateUtil.getHibernateCriteria(HibernateUtil.java:611) at com.company.ws.product.HibernateUtil.get(HibernateUtil.java:1075) at com.company.ws.product.HibernateUtil.get(HibernateUtil.java:1028) at com.company.ws.product.services.callscreening.CallScreeningManagerImpl.get(CallScreeningManagerImpl.java:58) at com.company.ws.product.services.location.engine.LocationTargetsMgr.getExtensionLocationDialPlan(LocationTargetsMgr.java:686) at com.company.ws.product.services.location.engine.LocationTargetsMgr.getExtensionLocationEntity(LocationTargetsMgr.java:457) at com.company.ws.product.services.location.engine.LocationTargetsMgr.getLocationTargets(LocationTargetsMgr.java:224) at com.company.ws.product.services.location.engine.LocationEngineServiceImpl.getLocationTargets(LocationEngineServiceImpl.java:59) at sun.reflect.GeneratedMethodAccessor1128.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:585) at com.sun.xml.ws.api.server.InstanceResolver$1.invoke(InstanceResolver.java:246) at com.sun.xml.ws.server.InvokerTube$2.invoke(InvokerTube.java:146) at com.sun.xml.ws.server.sei.EndpointMethodHandler.invoke(EndpointMethodHandler.java:257) at com.sun.xml.ws.server.sei.SEIInvokerTube.processRequest(SEIInvokerTube.java:93) at com.sun.xml.ws.api.pipe.Fiber.__doRun(Fiber.java:595) at com.sun.xml.ws.api.pipe.Fiber._doRun(Fiber.java:554) at com.sun.xml.ws.api.pipe.Fiber.doRun(Fiber.java:539) at com.sun.xml.ws.api.pipe.Fiber.runSync(Fiber.java:436) - locked <0x7f5ce188> (a com.sun.xml.ws.api.pipe.Fiber) at com.sun.xml.ws.server.WSEndpointImpl$2.process(WSEndpointImpl.java:243) at com.sun.xml.ws.transport.http.HttpAdapter$HttpToolkit.handle(HttpAdapter.java:444) at com.sun.xml.ws.transport.http.HttpAdapter.handle(HttpAdapter.java:244) at com.sun.xml.ws.transport.http.servlet.ServletAdapter.handle(ServletAdapter.java:135) at com.sun.xml.ws.transport.http.servlet.WSServletDelegate.doGet(WSServletDelegate.java:129) at com.sun.xml.ws.transport.http.servlet.WSServletDelegate.doPost(WSServletDelegate.java:160) at com.sun.xml.ws.transport.http.servlet.WSServlet.doPost(WSServlet.java:75) at javax.servlet.http.HttpServlet.service(HttpServlet.java:710) at javax.servlet.http.HttpServlet.service(HttpServlet.java:803) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128) 
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:263) at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844) at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:584) at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447) at java.lang.Thread.run(Thread.java:595) "pool-1-thread-6": at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:231) - waiting to lock <0x7b8a2258> (a org.apache.log4j.RollingFileAppender) at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66) at org.apache.log4j.Category.callAppenders(Category.java:206) - locked <0x7b8aff60> (a org.apache.log4j.Logger) at org.apache.log4j.Category.forcedLog(Category.java:391) at org.apache.log4j.Category.log(Category.java:856) at org.apache.commons.logging.impl.Log4JLogger.debug(Log4JLogger.java:110) at org.hibernate.jdbc.AbstractBatcher.log(AbstractBatcher.java:401) at org.hibernate.jdbc.AbstractBatcher.getPreparedStatement(AbstractBatcher.java:482) at org.hibernate.jdbc.AbstractBatcher.getPreparedStatement(AbstractBatcher.java:423) at org.hibernate.jdbc.AbstractBatcher.prepareQueryStatement(AbstractBatcher.java:139) at org.hibernate.loader.Loader.prepareQueryStatement(Loader.java:1547) at org.hibernate.loader.Loader.doQuery(Loader.java:673) at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:236) at org.hibernate.loader.Loader.doList(Loader.java:2220) at org.hibernate.loader.Loader.listUsingQueryCache(Loader.java:2136) at org.hibernate.loader.Loader.list(Loader.java:2096) at org.hibernate.loader.criteria.CriteriaLoader.list(CriteriaLoader.java:94) at org.hibernate.impl.SessionImpl.list(SessionImpl.java:1569) at org.hibernate.impl.CriteriaImpl.list(CriteriaImpl.java:283) at com.company.ws.product.HibernateUtil.get(HibernateUtil.java:1082) at com.company.ws.product.services.callmonitor.CallMonitorManagerImpl.getCallMonEntries(CallMonitorManagerImpl.java:89) at com.company.ws.product.services.callmonitor.CallMonitorManagerImpl.locateEntry(CallMonitorManagerImpl.java:209) at com.company.ws.product.services.callmonitor.CallMonitorManagerImpl.handleCallUpdatedEvent(CallMonitorManagerImpl.java:184) at com.company.product.CallStateTransactionalHandlerList.handleCallUpdatedEvent(CallStateTransactionalHandlerList.java:95) at com.company.product.CallUpdateSubJob.call(CallUpdateSubJob.java:46) at com.company.product.CallUpdateSubJob.call(CallUpdateSubJob.java:18) at com.company.mvutil.collection.MultiPartJob.executeSubJob(MultiPartJob.java:114) at com.company.mvutil.collection.MultiPartJob.executeSubJobs(MultiPartJob.java:96) at com.company.mvutil.collection.MultiPartJob.run(MultiPartJob.java:60) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675) at java.lang.Thread.run(Thread.java:595) at com.company.product.CallStateDispatcher$SessionedThread.run(CallStateDispatcher.java:313) Found 1 deadlock. 
root /opt/tomcat/logs > /usr/java/j2se5/bin/java -version
java version "1.5.0_12"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_12-b04)
Java HotSpot(TM) Client VM (build 1.5.0_12-b04, mixed mode)
root@qamd2 /opt/tomcat/logs > uname -a
Linux qamd2 2.6.16.33-xen #1 SMP Thu Mar 22 12:04:09 IST 2007 x86_64 x86_64 x86_64 GNU/Linux
We have the same deadlock issue under Windows XP and log4j 1.2.15. Is there a workaround? Is it specific to using the AsyncAppender with the RollingFileAppender?

"movementMDPListenerContainer-1" prio=6 tid=0x5edfa8d0 nid=0x1bcc waiting for monitor entry [0x60c5d000..0x60c5fa18]
   at org.apache.log4j.Category.callAppenders(Category.java:185)
   - waiting to lock <0x13b78530> (a org.apache.log4j.spi.RootCategory)
   at org.apache.log4j.Category.forcedLog(Category.java:372)
   at org.apache.log4j.Category.log(Category.java:864)
   at sun.reflect.GeneratedMethodAccessor39.invoke(Unknown Source)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:585)
   at org.apache.commons.logging.impl.Log4jProxy.log(Log4jProxy.java:288)
   at org.apache.commons.logging.impl.Log4jProxy.debug(Log4jProxy.java:235)
   at org.apache.commons.logging.impl.Log4JLogger.debug(Log4JLogger.java:84)
   at org.hibernate.jdbc.AbstractBatcher.log(AbstractBatcher.java:393)
   at org.hibernate.jdbc.AbstractBatcher.getPreparedStatement(AbstractBatcher.java:474)
   at org.hibernate.jdbc.AbstractBatcher.getPreparedStatement(AbstractBatcher.java:415)
   at org.hibernate.jdbc.AbstractBatcher.prepareQueryStatement(AbstractBatcher.java:139)
   at org.hibernate.loader.Loader.prepareQueryStatement(Loader.java:1560)
   at org.hibernate.loader.Loader.doQuery(Loader.java:661)
   at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:224)
   at org.hibernate.loader.Loader.doList(Loader.java:2144)
   at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2028)
   at org.hibernate.loader.Loader.list(Loader.java:2023)
   at org.hibernate.loader.criteria.CriteriaLoader.list(CriteriaLoader.java:95)
   at org.hibernate.impl.SessionImpl.list(SessionImpl.java:1569)
Sorry, I forgot to post the other thread. I believe there is a conflict in the use of synchronized between AsyncAppender.append and Category.callAppenders. Is any solution available?

"movementMDPListenerContainer-2" prio=6 tid=0x5f1efe28 nid=0x1f8c in Object.wait() [0x60b1f000..0x60b1fc98]
   at java.lang.Object.wait(Native Method)
   - waiting on <0x13b80ce8> (a org.apache.log4j.helpers.BoundedFIFO)
   at java.lang.Object.wait(Object.java:474)
   at org.apache.log4j.AsyncAppender.append(AsyncAppender.java:85)
   - locked <0x13b80ce8> (a org.apache.log4j.helpers.BoundedFIFO)
   at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:221)
   - locked <0x13b65418> (a org.apache.log4j.AsyncAppender)
   at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:57)
   at org.apache.log4j.Category.callAppenders(Category.java:187)
   - locked <0x13b78530> (a org.apache.log4j.spi.RootCategory)
   at org.apache.log4j.Category.forcedLog(Category.java:372)
   at org.apache.log4j.Category.info(Category.java:674)

(In reply to comment #19)
> We have the same deadlock issue under Windows XP and log4j 1.2.15. Is there a
> workaround? Is it specific to using the AsyncAppender with the RollingFileAppender?
> [stack trace snipped]
Take a look at bug 44157. I'm not sure these are related, but bug 44157 was what was causing most of our deadlocks (on Solaris only), because we would interrupt threads. See my comment #14 and comment #15.
Comments 18-20 do not appear to be related to the initially reported bug, which from all the discussion appears to be a Solaris JVM bug and not a log4j bug. Comment 18 appears to occur because Hibernate uses an Object as a message and that Object's toString() method in turn triggers logging calls. Unfortunately, log4j 1.2 is not designed to be reentrant; you can't reliably make log4j logging calls from within log4j. Options would be to configure an object renderer that avoids calling the problematic toString(), or to remove the logging calls from the problematic toString(). The stack traces in 19 and 20 clearly look like the problematic pre-1.2.14 AsyncAppender, which had many synchronization issues. The suggested resolution there would be to upgrade to log4j 1.2.15 or later. If there are new issues, they should be logged as new bug reports.
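To illustrate the renderer option, here is a minimal sketch (the renderer class is hypothetical; SearchCriteria is just the type visible in the comment 18 stack trace) that replaces the default toString()-based rendering so the appender never re-enters log4j while holding its locks:

import org.apache.log4j.or.ObjectRenderer;

// Hypothetical renderer: describe the rendered object without invoking its
// toString(), so rendering inside the appender cannot trigger nested logging.
public class SafeRenderer implements ObjectRenderer {
    public String doRender(Object o) {
        // Identity-style description only; deliberately avoids o.toString().
        return o.getClass().getName() + "@" + Integer.toHexString(System.identityHashCode(o));
    }
}

It would be registered with a properties entry along the lines of log4j.renderer.com.company.ws.product.SearchCriteria=SafeRenderer (adjust both class names to the real, fully qualified ones in your application).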
If this is resolved in 1.2.15, does that mean that this issue is resolved as fixed?
It's somewhat unclear from the above posts whether we're talking about a deadlock or just a lock contention/scalability issue. I can confirm that there does seem to be a scalability issue. This is with log4j 1.2.5, used in a webapp that runs under Tomcat, so there will be tens or hundreds of threads calling into log4j simultaneously. What we're seeing is typically myriads of these:

"TP-Processor28" daemon prio=10 tid=0x08449800 nid=0x361 waiting for monitor entry [0x06c6d000..0x06c6e050]
   java.lang.Thread.State: BLOCKED (on object monitor)
   at org.apache.log4j.Category.callAppenders(Category.java:204)
   - waiting to lock <0x841e4b10> (a org.apache.log4j.spi.RootLogger)
   at org.apache.log4j.Category.forcedLog(Category.java:391)
   at org.apache.log4j.Category.debug(Category.java:260)

all blocking against one that looks similar to this:

"TP-Processor120" daemon prio=10 tid=0x0852e400 nid=0x4c6 waiting for monitor entry [0x07c81000..0x07c821d0]
   java.lang.Thread.State: BLOCKED (on object monitor)
   at org.apache.log4j.Category.callAppenders(Category.java:204)
   - locked <0x841e4b10> (a org.apache.log4j.spi.RootLogger)
   at org.apache.log4j.Category.forcedLog(Category.java:391)
   at org.apache.log4j.Category.info(Category.java:666)

This isn't a deadlock; it just shows that the code in Category.callAppenders doesn't scale well, because it synchronizes at too high a level:

public void callAppenders(LoggingEvent event) {
    int writes = 0;

    for(Category c = this; c != null; c=c.parent) {
      // Protected against simultaneous call to addAppender, removeAppender,...
      synchronized(c) {
        if(c.aai != null) {
          writes += c.aai.appendLoopOnAppenders(event);
        }
        if(!c.additive) {
          break;
        }
      }
    }

    if(writes == 0) {
      repository.emitNoAppenderWarning(this);
    }
}

Now, fixing the overeager synchronization here would just expose the next bottleneck, which is AppenderSkeleton.doAppend. Synchronization should be moved down to the append method, I guess.
The problem is pretty consistent and seems to be happening on high traffic servers. Here is one that I caught on my weblogic server. This lock will eventually bring the server down since the request processor / servlet engines will be unable to process any further requests. I am not sure about the lock scalability thing. My logs indicate a positive deadlock condition. Is this going to be fixed anytime soon ? java version "1.5.0_06" Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_06-b05) BEA JRockit(R) (build R26.4.0-63_CR302700-72606-1.5.0_06-20061127-1108-win-ia32, ) OS - Windows 2000 SP4 Log4j version - 1.2.15 Server - Weblogic 10 SP1 Deadlocks encountered on a web application built on Struts 1.1 Circular (deadlocked) lock chains ================================= Chain 2: ================================= "[STUCK] ExecuteThread: '39' for queue: 'weblogic.kernel.Default (self-tuning)'" id=11799 idx=0xee tid=6520 waiting for org/apache/log4j/Logger@0x05AE0EF0 held by: "[STUCK] ExecuteThread: '28' for queue: 'weblogic.kernel.Default (self-tuning)'" id=6044 idx=0x9c tid=6496 waiting for org/apache/log4j/RollingFileAppender@0x0B1E4CB8 held by: "[STUCK] ExecuteThread: '39' for queue: 'weblogic.kernel.Default (self-tuning)'" id=11799 idx=0xee tid=6520 ================================= Thread 39 ================================= "[STUCK] ExecuteThread: '39' for queue: 'weblogic.kernel.Default (self-tuning)'" id=11799 idx=0xee tid=6520 prio=1 alive, in native, blocked, daemon -- Blocked trying to get lock: org/apache/log4j/Logger@0x05AE0EF0[thin lock] at jrockit/vm/Threads.sleep(I)V(Native Method) at jrockit/vm/Locks.waitForThinRelease(Ljava/lang/Object;I)I(Unknown Source) at jrockit/vm/Locks.monitorEnterSecondStage(Ljava/lang/Object;I)Ljava/lang/Object;(Unknown Source) at jrockit/vm/Locks.monitorEnter(Ljava/lang/Object;)Ljava/lang/Object;(Unknown Source) at org/apache/log4j/Category.callAppenders(Lorg/apache/log4j/spi/LoggingEvent;)V(Category.java:201) at org/apache/log4j/Category.forcedLog(Ljava/lang/String;Lorg/apache/log4j/Priority;Ljava/lang/Object;Ljava/lang/Throwable;)V(Category.java:388) at org/apache/log4j/Category.debug(Ljava/lang/Object;)V(Category.java:257) ================================= Thread 28 ================================= "[STUCK] ExecuteThread: '28' for queue: 'weblogic.kernel.Default (self-tuning)'" id=6044 idx=0x9c tid=6496 prio=1 alive, in native, blocked, daemon -- Blocked trying to get lock: org/apache/log4j/RollingFileAppender@0x0B1E4CB8[thin lock] at jrockit/vm/Threads.sleep(I)V(Native Method) at jrockit/vm/Locks.waitForThinRelease(Ljava/lang/Object;I)I(Unknown Source) at jrockit/vm/Locks.monitorEnterSecondStage(Ljava/lang/Object;I)Ljava/lang/Object;(Unknown Source) at jrockit/vm/Locks.monitorEnter(Ljava/lang/Object;)Ljava/lang/Object;(Unknown Source) at org/apache/log4j/AppenderSkeleton.doAppend(Lorg/apache/log4j/spi/LoggingEvent;)V(AppenderSkeleton.java:210) at org/apache/log4j/helpers/AppenderAttachableImpl.appendLoopOnAppenders(Lorg/apache/log4j/spi/LoggingEvent;)I(AppenderAttachableImpl.java:65) at org/apache/log4j/Category.callAppenders(Lorg/apache/log4j/spi/LoggingEvent;)V(Category.java:203) ^-- Holding lock: org/apache/log4j/Logger@0x05AE0EF0[thin lock] at org/apache/log4j/Category.forcedLog(Ljava/lang/String;Lorg/apache/log4j/Priority;Ljava/lang/Object;Ljava/lang/Throwable;)V(Category.java:388) at org/apache/log4j/Category.debug(Ljava/lang/Object;)V(Category.java:257)
Yet another case of deadlock we encountered. It is reported by the Java thread dump, in the log4j API, and causes the process to go down. The problem seems to be very frequent as the load on the server is quite high. The thread dump indicates that the deadlock is caused by the RollingFileAppender. Please reply here if any solution has been settled on for this issue.

We are on:
Java version: java version "1.5.0_11"
log4j version: log4j-1.2.13
Platform: Sun Solaris "SunOS XXX 5.10 Generic_127111-11 sun4u sparc SUNW,Sun-Fire-V240"

The deadlock thread dump snippet:

Found one Java-level deadlock:
=============================
"AlarmCounter-Thread":
  waiting to lock monitor 0x00149d20 (object 0xa86e6988, a java.io.PrintStream), which is held by "EventListenerThreadPool:Thread-1087"
"EventListenerThreadPool:Thread-1087":
  waiting to lock monitor 0x00149c90 (object 0xa6804bd8, a org.apache.log4j.RollingFileAppender), which is held by "AlarmCounter-Thread"

Java stack information for the threads listed above:
===================================================
"AlarmCounter-Thread":
   at java.io.PrintStream.println(Unknown Source)
   - waiting to lock <0xa86e6988> (a java.io.PrintStream)
   at org.apache.log4j.helpers.LogLog.error(LogLog.java:142)
   at org.apache.log4j.helpers.OnlyOnceErrorHandler.error(OnlyOnceErrorHandler.java:77)
   at org.apache.log4j.helpers.OnlyOnceErrorHandler.error(OnlyOnceErrorHandler.java:67)
   at org.apache.log4j.helpers.CountingQuietWriter.write(CountingQuietWriter.java:48)
   at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:301)
   at org.apache.log4j.RollingFileAppender.subAppend(RollingFileAppender.java:234)
   at org.apache.log4j.WriterAppender.append(WriterAppender.java:159)
   at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:230)
   - locked <0xa6804bd8> (a org.apache.log4j.RollingFileAppender)
   at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:65)
   at org.apache.log4j.Category.callAppenders(Category.java:203)
   - locked <0xa680b0b8> (a org.apache.log4j.Logger)
   at org.apache.log4j.Category.forcedLog(Category.java:388)
   at org.apache.log4j.Category.error(Category.java:319)
   at com.airvana.faultServer.db.SQLUtil.executeQuery(SQLUtil.java:480)
   at com.airvana.faultServer.db.SQLUtil.executeQuery(SQLUtil.java:496)
   at com.airvana.faultServer.db.AlarmCounter.initializeFromDb(AlarmCounter.java:71)
   - locked <0xa8710e60> (a com.airvana.faultServer.db.AlarmCounter)
   at com.airvana.faultServer.db.AlarmCounter.run(AlarmCounter.java:173)
   at java.lang.Thread.run(Unknown Source)
"EventListenerThreadPool:Thread-1087":
   at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:210)
   - waiting to lock <0xa6804bd8> (a org.apache.log4j.RollingFileAppender)
   at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:65)
   at org.apache.log4j.Category.callAppenders(Category.java:203)
   - locked <0xa68047a8> (a org.apache.log4j.spi.RootLogger)
   at org.apache.log4j.Category.forcedLog(Category.java:388)
   at org.apache.log4j.Category.log(Category.java:835)
   at com.airvana.faultServer.utils.LoggingOutputStream.flush(LoggingOutputStream.java:189)
   at java.io.PrintStream.write(Unknown Source)
   - locked <0xa86e6988> (a java.io.PrintStream)
   at sun.nio.cs.StreamEncoder$CharsetSE.writeBytes(Unknown Source)
   at sun.nio.cs.StreamEncoder$CharsetSE.implWrite(Unknown Source)
   at sun.nio.cs.StreamEncoder.write(Unknown Source)
   - locked <0xa8809018> (a java.io.OutputStreamWriter)
   at java.io.OutputStreamWriter.write(Unknown Source)
   at java.io.BufferedWriter.flushBuffer(Unknown Source)
   - locked <0xa8809018> (a java.io.OutputStreamWriter)
   at java.io.BufferedWriter.write(Unknown Source)
   - locked <0xa8809018> (a java.io.OutputStreamWriter)
   at java.io.Writer.write(Unknown Source)
   at java.io.PrintStream.write(Unknown Source)
   - locked <0xa86e6988> (a java.io.PrintStream)
   at java.io.PrintStream.print(Unknown Source)
   at java.lang.ThreadGroup.uncaughtException(Unknown Source)
   at java.lang.ThreadGroup.uncaughtException(Unknown Source)
   at java.lang.Thread.dispatchUncaughtException(Unknown Source)

Found 1 deadlock.
Hi all,

I wanted to add more data points on behalf of Vijay (post #26) above.

1. The disk space on the partition to which the logs are written was 100% full at some point on the server. We have not tried to reproduce the problem that way yet, but we need some confirmation about log4j's behavior under this condition. We are not even sure whether a full disk can cause this issue. Could it have caused the deadlock? How does log4j behave on a disk-full condition?

2. As per the stack trace, the thread (EventListenerThreadPool:Thread-1087) had encountered an uncaught exception. We are not yet sure what the exception was, but sadly that same call/thread deadlocked and no logs were written (disk full?).

   at java.io.PrintStream.print(Unknown Source)
   at java.lang.ThreadGroup.uncaughtException(Unknown Source)
   at java.lang.ThreadGroup.uncaughtException(Unknown Source)
   at java.lang.Thread.dispatchUncaughtException(Unknown Source)

I am not sure whether this is a known issue or expected behavior in log4j, but I want to understand how we could resolve it. Will upgrading to log4j 1.2.15 help? 1.2.14 did have deadlock fixes for AsyncAppender, but RollingFileAppender also seems to have the same bug.
Hi,

We are using log4j 1.2.14 under a JBoss environment (i.e. log4j-boot.jar). Frequent JVM crashes are happening in high-load production environments because of the log4j RollingFileAppender. The log level is DEBUG. Here is the stack trace:

Thread: http-0.0.0.0-8080-4 : priority:5, demon:true, threadId:104, threadState:BLOCKED, lockName:org.jboss.logging.appender.RollingFileAppender@718bc0c4
org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:210)
org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:65)
org.apache.log4j.Category.forcedLog(Category.java:388)
org.apache.log4j.Category.info(Category.java:663)

We applied the patch proposed by http://marc.info/?l=log4j-dev&m=121271153819013&w=2 but the problem is still not solved. This is a synchronization problem with many threads, 5000 or more per second in production, trying to access the application, with the corresponding log statements being called for each thread. Our application strictly enforces using smartDebug, smartError, smartInfo, etc.

We tried using AsyncAppender with the above patch. It solved the synchronization problem in the callAppenders() method, but now AppenderSkeleton's doAppend() is synchronized:

org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:231)
org.apache.log4j.helpers.AppenderAttachableImpl5.appendLoopOnAppenders(AppenderAttachableImpl5.java:79)
org.apache.log4j.Category.callAppenders(Category.java:203)
org.apache.log4j.Category.forcedLog(Category.java:387)
org.apache.log4j.Category.info(Category.java:662)

This is for sure a synchronization problem, irrespective of which application server you are using. Synchronization needs to be moved to a lower level, i.e. the append method. JBoss tightly integrates log4j, so applying a patch is also cumbersome. Before posting here we tried every single option to solve the issue, but with no luck.
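For reference, here is a minimal sketch of what wrapping the RollingFileAppender in a post-1.2.14 AsyncAppender looks like when done programmatically (the file name, pattern, and buffer size are made up, and a real JBoss setup would do this through its own log4j configuration). The calling threads still pass through a synchronized doAppend(), but that critical section only enqueues the event (or waits if the buffer is full when blocking is enabled); the file I/O and rollover stay on the AsyncAppender's dispatcher thread rather than on the request threads:

import org.apache.log4j.AsyncAppender;
import org.apache.log4j.Logger;
import org.apache.log4j.PatternLayout;
import org.apache.log4j.RollingFileAppender;

// Sketch only: route all root-logger output through an AsyncAppender that
// feeds a RollingFileAppender, so callers only pay for enqueueing the event.
public class AsyncRollingSetup {
    public static void main(String[] args) throws Exception {
        RollingFileAppender file = new RollingFileAppender(
                new PatternLayout("%d %-5p [%t] %c - %m%n"), "app.log");
        file.setMaxFileSize("10MB");
        file.setMaxBackupIndex(5);

        AsyncAppender async = new AsyncAppender();
        async.setBufferSize(512);   // events buffered before callers block
        async.setBlocking(true);    // block (rather than discard) when the buffer is full
        async.addAppender(file);

        Logger.getRootLogger().addAppender(async);
        Logger.getLogger(AsyncRollingSetup.class).info("async logging configured");
    }
}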
Any updates on this? I have the same problem:

"Thread-6" daemon prio=10 tid=0x00007feb02a29800 nid=0x387f waiting for monitor entry [0x00000000418bb000]
   java.lang.Thread.State: BLOCKED (on object monitor)
   at org.apache.log4j.Category.callAppenders(Category.java:204)
   - waiting to lock <0x00007feb091fab30> (a org.apache.log4j.spi.RootLogger)
   at org.apache.log4j.Category.forcedLog(Category.java:391)
   at org.apache.log4j.Category.log(Category.java:856)
   at org.apache.commons.logging.impl.Log4JLogger.debug(Log4JLogger.java:177)
   at org.apache.jcs.engine.memory.shrinking.ShrinkerThread.shrink(ShrinkerThread.java:124)
   at org.apache.jcs.engine.memory.shrinking.ShrinkerThread.run(ShrinkerThread.java:96)
   at EDU.oswego.cs.dl.util.concurrent.ClockDaemon$RunLoop.run(Unknown Source)
   at java.lang.Thread.run(Thread.java:619)
(In reply to comment #29)
> Any updates on this?
>
> I have the same problem:

I have a very similar problem:

"http-13180-143" daemon prio=10 tid=0x00002aac3c5bb000 nid=0x262c2 waiting for monitor entry [0x00002aac300bc000]
   java.lang.Thread.State: BLOCKED (on object monitor)
   at org.apache.log4j.Category.callAppenders(Category.java:201)
   - locked <0x00002aaab8919168> (a org.apache.log4j.spi.RootLogger)
   at org.apache.log4j.Category.forcedLog(Category.java:388)
   at org.apache.log4j.Category.error(Category.java:302)

Please assist.
Below Log4J dead lock was seen when running on 1.2.15. We have upgraded to 1.2.16 now, but not sure whether this issue is already fixed in the latest version. Found one Java-level deadlock: ============================= "pool-3-thread-13": waiting to lock monitor 0x001782d8 (object 0xa85627f0, a java.io.PrintStream), which is held by "pool-3-thread-14" "pool-3-thread-14": waiting to lock monitor 0x00178320 (object 0xa6400ee0, a org.apache.log4j.RollingFileAppender), which is held by "pool-3-thread-13" Java stack information for the threads listed above: =================================================== "pool-3-thread-13": at java.lang.Throwable.printStackTrace(Throwable.java:462) - waiting to lock <0xa85627f0> (a java.io.PrintStream) at java.lang.Throwable.printStackTrace(Throwable.java:452) at org.apache.log4j.helpers.LogLog.error(LogLog.java:145) at org.apache.log4j.helpers.OnlyOnceErrorHandler.error(OnlyOnceErrorHandler.java:78) at org.apache.log4j.helpers.OnlyOnceErrorHandler.error(OnlyOnceErrorHandler.java:68) at org.apache.log4j.helpers.QuietWriter.flush(QuietWriter.java:60) at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:316) at org.apache.log4j.RollingFileAppender.subAppend(RollingFileAppender.java:263) at org.apache.log4j.WriterAppender.append(WriterAppender.java:160) at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251) - locked <0xa6400ee0> (a org.apache.log4j.RollingFileAppender) at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66) at org.apache.log4j.Category.callAppenders(Category.java:206) - locked <0xa640eed8> (a org.apache.log4j.Logger) at org.apache.log4j.Category.forcedLog(Category.java:391) at org.apache.log4j.Category.warn(Category.java:1043) at com.airvana.faultServer.eventEngine.AirvanaPollServer.handleNodeNotReachableState(AirvanaPollServer.java:192) at com.airvana.faultServer.niohandlers.NioNotificationHandler.declareNodeUnreachable(NioNotificationHandler.java:593) at com.airvana.faultServer.niohandlers.NioNotificationHandler.channelDisconnected(NioNotificationHandler.java:198) - locked <0xa86e8d48> (a com.airvana.faultServer.db.NodeObj) at org.jboss.netty.channel.SimpleChannelHandler.handleUpstream(SimpleChannelHandler.java:137) at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:567) at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:803) at org.jboss.netty.handler.execution.ChannelEventRunnable.run(ChannelEventRunnable.java:76) at org.jboss.netty.handler.execution.OrderedMemoryAwareThreadPoolExecutor$ChildExecutor.run(OrderedMemoryAwareThreadPoolExecutor.java:314) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:651) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:676) at java.lang.Thread.run(Thread.java:595) "pool-3-thread-14": at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:231) - waiting to lock <0xa6400ee0> (a org.apache.log4j.RollingFileAppender) at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66) at org.apache.log4j.Category.callAppenders(Category.java:206) - locked <0xa6400b18> (a org.apache.log4j.spi.RootLogger) at org.apache.log4j.Category.forcedLog(Category.java:391) at org.apache.log4j.Category.log(Category.java:838) at com.airvana.faultServer.utils.LoggingOutputStream.flush(LoggingOutputStream.java:185) at 
java.io.PrintStream.write(PrintStream.java:414) - locked <0xa85627f0> (a java.io.PrintStream) at sun.nio.cs.StreamEncoder$CharsetSE.writeBytes(StreamEncoder.java:336) at sun.nio.cs.StreamEncoder$CharsetSE.implFlushBuffer(StreamEncoder.java:404) at sun.nio.cs.StreamEncoder.flushBuffer(StreamEncoder.java:115) - locked <0xa8c04e00> (a java.io.OutputStreamWriter) at java.io.OutputStreamWriter.flushBuffer(OutputStreamWriter.java:169) at java.io.PrintStream.write(PrintStream.java:459) - locked <0xa85627f0> (a java.io.PrintStream) at java.io.PrintStream.print(PrintStream.java:602) at java.io.PrintStream.println(PrintStream.java:739) - locked <0xa85627f0> (a java.io.PrintStream) at org.apache.log4j.helpers.LogLog.error(LogLog.java:143) at org.apache.log4j.helpers.OnlyOnceErrorHandler.error(OnlyOnceErrorHandler.java:78) at org.apache.log4j.helpers.OnlyOnceErrorHandler.error(OnlyOnceErrorHandler.java:68) at org.apache.log4j.helpers.QuietWriter.flush(QuietWriter.java:60) at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:316) at org.apache.log4j.RollingFileAppender.subAppend(RollingFileAppender.java:263) at org.apache.log4j.WriterAppender.append(WriterAppender.java:160) at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251) - locked <0xa6401628> (a org.apache.log4j.RollingFileAppender) at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66) at org.apache.log4j.Category.callAppenders(Category.java:206) - locked <0xa6401388> (a org.apache.log4j.Logger) at org.apache.log4j.Category.forcedLog(Category.java:391) at org.apache.log4j.Category.info(Category.java:666) at com.airvana.faultServer.niohandlers.NioNotificationHandler.declareNodeUnreachable(NioNotificationHandler.java:586) - locked <0xa8596850> (a java.lang.Integer) at com.airvana.faultServer.niohandlers.NioNotificationHandler.channelDisconnected(NioNotificationHandler.java:198) - locked <0xa89b0a90> (a com.airvana.faultServer.db.NodeObj) at org.jboss.netty.channel.SimpleChannelHandler.handleUpstream(SimpleChannelHandler.java:137) at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:567) at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:803) at org.jboss.netty.handler.execution.ChannelEventRunnable.run(ChannelEventRunnable.java:76) at org.jboss.netty.handler.execution.OrderedMemoryAwareThreadPoolExecutor$ChildExecutor.run(OrderedMemoryAwareThreadPoolExecutor.java:314) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:651) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:676) at java.lang.Thread.run(Thread.java:595) Found 1 deadlock.
log4j version 1.2.16. Deadlock:

[2/1/11 15:22:44:977 EST] 00000005 TimeoutManage I WTRN0124I: When the timeout occurred the thread with which the transaction is, or was most recently, associated was Thread[WMQJCAResourceAdapter : 3,5,main]. The stack trace of this thread when the timeout occurred was:
java.io.UnixFileSystem.getBooleanAttributes0(Native Method)
java.io.UnixFileSystem.getBooleanAttributes(UnixFileSystem.java:240)
java.io.File.exists(File.java:733)
org.apache.log4j.RollingFileAppender.rollOver(RollingFileAppender.java:151)
org.apache.log4j.RollingFileAppender.subAppend(RollingFileAppender.java:280)
org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
org.apache.log4j.Category.callAppenders(Category.java:206)
org.apache.log4j.Category.forcedLog(Category.java:391)
org.apache.log4j.Category.debug(Category.java:260)
Component1Impl.logDebug(Component1Impl.java:140)

This is a multi-threaded application, installed on WebSphere Application Server v7, picking messages from a JMS queue on WebSphere MQ v7. At the time, 10 connections were open, retrieving messages from the queue and writing to a log file. The application stopped while log4j tried to roll over the log file, timing out the session.

So, how can we fix this?

Thanks!
Adilson Dias
Replaced log4j with logback and everything works fine. No deadlock.

Cheers,
Adilson Dias

(In reply to comment #32)
> Log4J version 1.2.16:
> Deadlock:
> [stack trace and description snipped]
hi Adilson, can you say what version of logback you used? I am using Logback 0.9.18 (released 03-Dec-2009) and still seeing what looks like the same problem - one thread ("http-8443-6") doing a log message, deadlocked waiting on another thread ("pool-4-thread-1") doing a log within a file compression operation: Java stack information for the threads listed above: =================================================== "pool-4-thread-1": at ch.qos.logback.core.rolling.RollingFileAppender.subAppend(RollingFileAppender.java:141) - waiting to lock <0x00002aaad13089f8> (a ch.qos.logback.core.rolling.TimeBasedRollingPolicy) at ch.qos.logback.core.WriterAppender.append(WriterAppender.java:120) at ch.qos.logback.core.UnsynchronizedAppenderBase.doAppend(UnsynchronizedAppenderBase.java:93) at ch.qos.logback.core.spi.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:64) at ch.qos.logback.classic.Logger.appendLoopOnAppenders(Logger.java:275) at ch.qos.logback.classic.Logger.callAppenders(Logger.java:262) at ch.qos.logback.classic.Logger.buildLoggingEventAndAppend(Logger.java:465) at ch.qos.logback.classic.Logger.filterAndLog_0_Or3Plus(Logger.java:419) at ch.qos.logback.classic.Logger.info(Logger.java:623) at com.aepona.tws.portal.common.logging.LogbackStatusChangeListener.addStatusEvent_aroundBody0(LogbackStatusChangeListener.java:43) at com.aepona.tws.portal.common.logging.LogbackStatusChangeListener.addStatusEvent_aroundBody1$advice(LogbackStatusChangeListener.java:61) at com.aepona.tws.portal.common.logging.LogbackStatusChangeListener.addStatusEvent(LogbackStatusChangeListener.java:41) at ch.qos.logback.core.BasicStatusManager.fireStatusAddEvent(BasicStatusManager.java:86) - locked <0x00002aaace451c30> (a java.lang.Object) at ch.qos.logback.core.BasicStatusManager.add(BasicStatusManager.java:58) at ch.qos.logback.core.spi.ContextAwareBase.addStatus(ContextAwareBase.java:71) at ch.qos.logback.core.spi.ContextAwareBase.addInfo(ContextAwareBase.java:76) at ch.qos.logback.core.rolling.helper.Compressor.compress(Compressor.java:59) at ch.qos.logback.core.rolling.helper.CompressionRunnable.run(CompressionRunnable.java:31) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:207) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619) "http-8443-6": at ch.qos.logback.core.BasicStatusManager.fireStatusAddEvent(BasicStatusManager.java:85) - waiting to lock <0x00002aaace451c30> (a java.lang.Object) at ch.qos.logback.core.BasicStatusManager.add(BasicStatusManager.java:58) at ch.qos.logback.core.spi.ContextAwareBase.addStatus(ContextAwareBase.java:71) at ch.qos.logback.core.spi.ContextAwareBase.addInfo(ContextAwareBase.java:76) at ch.qos.logback.core.rolling.helper.DefaultArchiveRemover.clean(DefaultArchiveRemover.java:78) at ch.qos.logback.core.rolling.TimeBasedRollingPolicy.rollover(TimeBasedRollingPolicy.java:139) at ch.qos.logback.core.rolling.RollingFileAppender.rollover(RollingFileAppender.java:110) - locked <0x00002aaad1307328> (a 
ch.qos.logback.core.rolling.RollingFileAppender) at ch.qos.logback.core.rolling.RollingFileAppender.subAppend(RollingFileAppender.java:142) - locked <0x00002aaad13089f8> (a ch.qos.logback.core.rolling.TimeBasedRollingPolicy) at ch.qos.logback.core.WriterAppender.append(WriterAppender.java:120) at ch.qos.logback.core.UnsynchronizedAppenderBase.doAppend(UnsynchronizedAppenderBase.java:93) at ch.qos.logback.core.spi.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:64) at ch.qos.logback.classic.Logger.appendLoopOnAppenders(Logger.java:275) at ch.qos.logback.classic.Logger.callAppenders(Logger.java:262) at ch.qos.logback.classic.Logger.buildLoggingEventAndAppend(Logger.java:465) at ch.qos.logback.classic.Logger.filterAndLog_0_Or3Plus(Logger.java:419) at ch.qos.logback.classic.Logger.info(Logger.java:623) (In reply to comment #33) > Replaced log4j with logback and everything works fine. No deadlock. > Cheers, > Adilson Dias >
####<Feb 16, 2011 7:05:25 PM GST> <Critical> <WebLogicServer> <camp1s> <CAMAProd> <[ACTIVE] ExecuteThread: '6' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <> <> <1297868725143> <BEA-000394> <

DEADLOCK DETECTED:
==================

[deadlocked thread] [ACTIVE] ExecuteThread: '4' for queue: 'weblogic.kernel.Default (self-tuning)':
--------------------------------------------------------------------------------------------------
Thread '[ACTIVE] ExecuteThread: '4' for queue: 'weblogic.kernel.Default (self-tuning)'' is waiting to acquire lock 'org.apache.log4j.RollingFileAppender@18cceaf' that is held by thread '[ACTIVE] ExecuteThread: '0' for queue: 'weblogic.kernel.Default (self-tuning)''

Stack trace:
------------
org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:210)
org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:65)
org.apache.log4j.Category.callAppenders(Category.java:203)
org.apache.log4j.Category.forcedLog(Category.java:388)
org.apache.log4j.Category.debug(Category.java:257)
com.company.product.jdbc.MyJDBC.selectAllObjects(MyJDBC.java:294)
com.company.product.business.MyBusinessMgrImpl.selectAllObjects(MyBusinessMgrImpl.java:155)
...

[deadlocked thread] [ACTIVE] ExecuteThread: '0' for queue: 'weblogic.kernel.Default (self-tuning)':
--------------------------------------------------------------------------------------------------
Thread '[ACTIVE] ExecuteThread: '0' for queue: 'weblogic.kernel.Default (self-tuning)'' is waiting to acquire lock 'org.apache.log4j.Logger@155742b' that is held by thread '[ACTIVE] ExecuteThread: '4' for queue: 'weblogic.kernel.Default (self-tuning)''

Stack trace:
------------
org.apache.log4j.Category.callAppenders(Category.java:201)
org.apache.log4j.Category.forcedLog(Category.java:388)
org.apache.log4j.Category.warn(Category.java:1008)
com.company.product.datatypes.BusinessException.toString(BusinessException.java:757)
org.apache.log4j.spi.VectorWriter.println(ThrowableInformation.java:97)
java.lang.Throwable.printStackTrace(Throwable.java:510)
org.apache.log4j.spi.ThrowableInformation.getThrowableStrRep(ThrowableInformation.java:59)
org.apache.log4j.spi.LoggingEvent.getThrowableStrRep(LoggingEvent.java:342)
org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:304)
org.apache.log4j.RollingFileAppender.subAppend(RollingFileAppender.java:234)
org.apache.log4j.WriterAppender.append(WriterAppender.java:159)
org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:230)
org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:65)
org.apache.log4j.Category.callAppenders(Category.java:203)
org.apache.log4j.Category.forcedLog(Category.java:388)
org.apache.log4j.Category.error(Category.java:319)
com.company.framework.ejb.connectivity.BaseMessageDrivenBean.doRollback(BaseMessageDrivenBean.java:111)
...

Before diving into the explanation, a few code samples are needed to fully understand it.

BusinessException extends Exception and implements toString() with some logging inside:

public class BusinessException extends Exception {
    private static final Logger log = Logger.getLogger(BusinessException.class);

    @Override
    public String toString() {
        ...
        log.debug("some logging in the toString()");
        ...
    }
}

The Category.callAppenders() method makes its way up the Logger hierarchy (knowledge of the log4j Logger hierarchy is needed). On each iteration the current Logger is locked with a synchronized block.
Once in the synchronized block, if an AppenderAttachableImpl exists on the current Logger, it calls AppenderAttachableImpl.appendLoopOnAppenders(LoggingEvent).

public class Category implements AppenderAttachable {
    public void callAppenders(LoggingEvent event) {
        int writes = 0;

        for(Category c = this; c != null; c=c.parent) {
            // Protected against simultaneous call to addAppender, removeAppender,...
            synchronized(c) {
                if(c.aai != null) {
                    writes += c.aai.appendLoopOnAppenders(event);
                }
                if(!c.additive) {
                    break;
                }
            }
        }

        if(writes == 0) {
            repository.emitNoAppenderWarning(this);
        }
    }
}

In AppenderAttachableImpl.appendLoopOnAppenders(LoggingEvent), the method Appender.doAppend(LoggingEvent) is called.

public class AppenderAttachableImpl implements AppenderAttachable {
    public int appendLoopOnAppenders(LoggingEvent event) {
        int size = 0;
        Appender appender;

        if(appenderList != null) {
            size = appenderList.size();
            for(int i = 0; i < size; i++) {
                appender = (Appender) appenderList.elementAt(i);
                appender.doAppend(event);
            }
        }
        return size;
    }
}

Finally, the RollingFileAppender indirectly implements Appender via the indirectly extended AppenderSkeleton class. AppenderSkeleton.doAppend(LoggingEvent) is a synchronized method, so calling RollingFileAppender.doAppend(LoggingEvent) will lock the current RollingFileAppender object.

public abstract class AppenderSkeleton implements Appender, OptionHandler {
    public synchronized void doAppend(LoggingEvent event) {
        ...
    }
}

What follows is what I found was happening. We have the following Logger hierarchy:

RootLogger
+- com.company - RollingFileAppender attached
+- com.company.product.jdbc.MyJDBC (ExecuteThread: '4')
+- com.company.product.datatypes.BusinessException (ExecuteThread: '0')
+- framework.core.ejb.connectivity.BaseMessageDrivenBean (ExecuteThread: '0') - RollingFileAppender attached

I think the sequence of events is something like this:

1. ExecuteThread: '0' – is logging on the BaseMessageDrivenBean logger with a BusinessException (which implements toString())
2. ExecuteThread: '0' - in Category.callAppenders(), it starts to make its way up the Logger hierarchy (locking the current Logger on each iteration).
3. ExecuteThread: '0' - it reaches and locks the "framework.core.ejb.connectivity.BaseMessageDrivenBean" Logger, where it finds a RollingFileAppender attached. It calls AppenderAttachableImpl.appendLoopOnAppenders(LoggingEvent), which ends up calling the synchronized method doAppend(LoggingEvent) on a RollingFileAppender object, locking it at the same time.
4. ExecuteThread: '4' – is logging something on the MyJDBC logger
5. ExecuteThread: '4' - in Category.callAppenders(), it starts to make its way up the Logger hierarchy (locking the current Logger on each iteration).
6. ExecuteThread: '4' - it reaches and locks the "com.company" Logger, where it finds a RollingFileAppender attached. It calls AppenderAttachableImpl.appendLoopOnAppenders(LoggingEvent), which ends up calling the synchronized method doAppend(LoggingEvent) on the _already locked_ RollingFileAppender object. *ExecuteThread: '4' is blocked waiting on ExecuteThread: '0'*
7. ExecuteThread: '0' – goes on with its execution inside the RollingFileAppender. This ends up logging the BusinessException, which indirectly calls BusinessException.toString(). Inside the toString() method, some logging is done on the BusinessException Logger.
8. ExecuteThread: '0' - in Category.callAppenders(), it starts to make its way up the Logger hierarchy (locking the current Logger on each iteration).
9. ExecuteThread: '0' - it reaches "com.company" and tries to lock it, but it is already locked by ExecuteThread: '4'. *ExecuteThread: '0' is blocked waiting on ExecuteThread: '4'*

THIS IS A DEADLOCK SITUATION.

This is not really a problem with log4j itself. The problem lies more in how log4j is configured, and in a complex sequence of method calls that ends up deadlocking on Java synchronized blocks. Using the RollingFileAppender probably makes things worse: if a file rollover (very slow compared to running Java code) happens during this sequence of events, it greatly increases the chances of hitting the problem.
Agree that the deadlock is not caused by log4j itself. It's caused by the design of the given app, i.e.:

- the fact that logging an exception through category A causes the exception to log an event to log4j through category B; ... and both categories use the same appender
- using renderers that log to log4j inside their code in a scenario similar to the one described above

While it's not log4j that is the direct cause of these issues, if log4j classes were synchronized differently, these problems would not be observed. If org.apache.log4j.Category.callAppenders() used a reentrant read/write lock, a thread that called category A, then called an appender which caused category B to be called, would not be blocked in org.apache.log4j.Category.callAppenders(). What is more, as org.apache.log4j.Category.callAppenders() would no longer be a problem, this thread would be able to reenter the lock in org.apache.log4j.AppenderSkeleton.doAppend(). The deadlock would not be observed.

I also mentioned changing the locking mechanism used in org.apache.log4j.Category here: https://issues.apache.org/bugzilla/show_bug.cgi?id=50213#c6 I'll create a separate Bugzilla ticket for this proposed refactoring.

Of course, introducing async appenders could be used as a workaround for these serious issues - for the deadlocks. However, org.apache.log4j.Category.callAppenders() is suboptimal anyway.

Bartek
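For reference, here is a minimal sketch of the async-appender workaround mentioned above, wiring org.apache.log4j.AsyncAppender in front of a RollingFileAppender programmatically; the class name, file name, pattern and buffer size are made up for the example. This only narrows the window for this class of problem rather than fixing the locking: application threads merely enqueue events, and a single background dispatcher thread is the one that holds the appender monitor while doing the file I/O.

import org.apache.log4j.AsyncAppender;
import org.apache.log4j.Logger;
import org.apache.log4j.PatternLayout;
import org.apache.log4j.RollingFileAppender;

public class AsyncAppenderWorkaround {
    public static void main(String[] args) throws Exception {
        // The usual synchronous file appender (illustrative settings only).
        RollingFileAppender file = new RollingFileAppender(
                new PatternLayout("%d [%t] %-5p %c - %m%n"), "app.log");
        file.setMaxFileSize("10MB");
        file.setMaxBackupIndex(5);

        // AsyncAppender buffers events and forwards them on its own thread,
        // so the logging thread does not block on the RollingFileAppender monitor.
        AsyncAppender async = new AsyncAppender();
        async.setBufferSize(512);
        async.addAppender(file);

        Logger.getRootLogger().addAppender(async);
        Logger.getLogger(AsyncAppenderWorkaround.class).info("logged through the async buffer");
    }
}

The same wiring is normally done declaratively in log4j.xml, since AsyncAppender needs an appender-ref and therefore cannot be configured from a plain properties file.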
I've just created https://issues.apache.org/bugzilla/attachment.cgi?bugid=51047 (Move org.apache.log4j.Category to reentrant read/write locks). This change should get rid of issues with synchronization and deadlocks.
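For readers who do not want to open the ticket, here is a rough, excerpt-style sketch (not the attached patch) of the direction being proposed: appends take the read side of a shared ReentrantReadWriteLock while structural changes such as addAppender() take the write side, so a thread that re-enters callAppenders() from a toString() call simply re-acquires a read lock it already holds. The fields parent, aai, additive and repository are the ones from the existing Category shown in comment #35, and a single static lock for the whole hierarchy is a simplification for illustration only.

import java.util.concurrent.locks.ReentrantReadWriteLock;

// Excerpt-style sketch of a reworked Category -- illustration only, not the patch.
public class Category {
    private static final ReentrantReadWriteLock hierarchyLock = new ReentrantReadWriteLock();

    public void callAppenders(LoggingEvent event) {
        int writes = 0;
        hierarchyLock.readLock().lock();   // shared and reentrant: concurrent appends, safe re-entry
        try {
            for (Category c = this; c != null; c = c.parent) {
                if (c.aai != null) {
                    writes += c.aai.appendLoopOnAppenders(event);
                }
                if (!c.additive) {
                    break;
                }
            }
        } finally {
            hierarchyLock.readLock().unlock();
        }
        if (writes == 0) {
            repository.emitNoAppenderWarning(this);
        }
    }

    public void addAppender(Appender newAppender) {
        hierarchyLock.writeLock().lock();  // structural changes remain exclusive
        try {
            if (aai == null) {
                aai = new AppenderAttachableImpl();
            }
            aai.addAppender(newAppender);
        } finally {
            hierarchyLock.writeLock().unlock();
        }
    }
}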
If we start requiring Java 5 for log4j, then there's a nice solution here for callAppenders: use a CopyOnWriteArrayList in AppenderAttachableImpl and thus avoid any locking when iterating over appenders. Without Java 5 you could implement your own CopyOnWriteArrayList, of course. That doesn't solve some application design issues, of course -- as AppenderSkeleton has a synchronized doAppend(). Many subclasses rely on this locking, so it can't be blithely removed. Removing the locking across all appenders for a given logger does eliminate some such issues, though. More importantly, it eliminates a key bottleneck in log4j. It makes sense that only one thread can write to a text file at a time. It makes no sense that no thread can write to a text file just because some other thread is writing to the console (or some such), though. Also many threads should be able to determine whether the filter is going to reject the event, format the event going to the text file, etc, at the same time. The locking really should be *very* last minute. Getting rid of the locking at the callAppenders() level is a big step in the right direction, though -- one which I've long since been taking by patching in an alternate implementation of AppenderAttachableImpl that uses CopyOnWriteArrayList as pre-Java-6 JVMs are ancient history to me at this point.
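A hedged sketch of that idea (assuming Java 5; the remaining AppenderAttachable methods are omitted and this is not the real log4j class): iteration over a CopyOnWriteArrayList is lock-free and always sees a consistent snapshot, so callAppenders() would no longer need to synchronize on the Category just to guard the appender list, while add/remove pay the (rare) copying cost.

import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

import org.apache.log4j.Appender;
import org.apache.log4j.spi.LoggingEvent;

// Illustration only -- not log4j's AppenderAttachableImpl.
public class CopyOnWriteAppenderAttachable {

    private final List<Appender> appenderList = new CopyOnWriteArrayList<Appender>();

    public void addAppender(Appender newAppender) {
        if (newAppender != null && !appenderList.contains(newAppender)) {
            appenderList.add(newAppender);   // copies the backing array; cheap because additions are rare
        }
    }

    public int appendLoopOnAppenders(LoggingEvent event) {
        int size = 0;
        for (Appender appender : appenderList) {   // snapshot iteration, no lock held here
            appender.doAppend(event);              // per-appender locking still applies inside doAppend
            size++;
        }
        return size;
    }
}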
(In reply to comment #38) Can you move this comment over to 51047?
(In reply to comment #38) > If we start requiring Java 5 for log4j, then there's a nice solution here for > callAppenders: use a CopyOnWriteArrayList in AppenderAttachableImpl and thus > avoid any locking when iterating over appenders. Without Java 5 you could > implement your own CopyOnWriteArrayList, of course. There is very little chance Log4j 1.x is ever going to upgrade to Java 5. I have been working on Log4j 2.0 in an experimental branch which addresses this issue as well as many others, including some design defects that are present in Logback. Unfortunately, Log4j 2.0 still needs a fair amount of work. Volunteers are welcome!
(In reply to comment #35)
Actually, I think the solution (or a mitigation) might be simple. I suggest the following "workaround" modifications in Log4j version 1.2.17 (the latest version):

1) In the org.apache.log4j.spi.LoggingEvent class, inside all the constructors:

Replace:
this.message = message;

By:
this.message = message != null ? message.toString() : null;

Doing so, you make the "message" instance field always of type String.

2) In the org.apache.log4j.spi.ThrowableInformation class:

2.1) Create the following instance field:
private final String[] stackTraceArray;

And, inside all the constructors, put the following piece of code:
this.stackTraceArray = getThrowableStrRep();

The intention here is to use the "stackTraceArray" instance field instead of calling getThrowableStrRep() later.

2.2) Make getThrowableStrRep() private. Doing so, there is no need to maintain the synchronization of this method. Actually, I think there would be no need to synchronize this method even if it were public, because I didn't see any place in the Log4j source code where an instance of ThrowableInformation is shared by more than one thread. I do not understand what the intention of the developer that created that method was; maybe he/she predicted a future need for synchronization, as ThrowableInformation is public and has public constructors. We never know the sort of things other programmers will do with our public classes and methods (the "bug in Log4j" we are discussing here shows exactly such a situation, doesn't it?)

2.3) Create a public method called "getStackTraceArray()" that returns the "stackTraceArray" instance field (or we can also return a clone of this array).

3) In all the places where getThrowableStrRep() is being called, replace it with a getStackTraceArray() invocation. This replacement should be done in the following classes:
org.apache.log4j.pattern.ThrowableInformationPatternConverter
org.apache.log4j.spi.LoggingEvent
org.apache.log4j.lf5.Log4JLogRecord

I do not think this is related just to RollingFileAppender, as the title of this page suggests. It's a synchronization-among-threads problem, not directly related to any particular appender. Trying to understand who is "guilty" of the problem (whether it is Log4j or the application/framework that uses Log4j) is not so important. What is relevant is the fact that we have a big, serious problem that has to be solved. Otherwise, we can never rely on a multithreaded application subject to a Log4j deadlock. If the solution I proposed doesn't solve the problem, I will really consider the alternative of removing Log4j from my application. I will sadly do that, although I like Log4j so much. But I tested my solution, and I think it works and covers all the critical cases. Of course, as I said, this solution might be considered just a mitigation, not a real solution. In my opinion, a real and elegant solution would require a deeper refactoring.
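To illustrate the intent of the proposal (this is not the submitted patch, and the helper below is a hypothetical stand-in for ThrowableInformation's rendering), the change boils down to taking the String snapshots in the event constructor, i.e. before forcedLog() enters callAppenders() and before any Category or appender monitor is held, so any logging triggered from toString() or printStackTrace() goes through as an ordinary, independent logging call:

import java.io.PrintWriter;
import java.io.StringWriter;

// Illustration of the snapshot idea, not log4j's actual LoggingEvent.
public class SnapshotLoggingEvent {
    private final String renderedMessage;
    private final String[] throwableStrRep;

    public SnapshotLoggingEvent(Object message, Throwable throwable) {
        // User code (message.toString(), throwable rendering) runs here,
        // while this thread holds no log4j lock.
        this.renderedMessage = (message != null) ? message.toString() : null;
        this.throwableStrRep = (throwable != null) ? renderStackTrace(throwable) : null;
    }

    public String getRenderedMessage() {
        return renderedMessage;
    }

    public String[] getThrowableStrRep() {
        return (throwableStrRep == null) ? null : throwableStrRep.clone();
    }

    // Hypothetical helper standing in for ThrowableInformation.getThrowableStrRep().
    private static String[] renderStackTrace(Throwable t) {
        StringWriter sw = new StringWriter();
        t.printStackTrace(new PrintWriter(sw));
        return sw.toString().split("\\r?\\n");
    }
}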
It's not as simple as:

this.message = message != null ? message.toString() : null;

because of the implementation of org.apache.log4j.spi.LoggingEvent.getRenderedMessage(): when the message is not already a String, that method runs it through the repository's object renderers, so converting eagerly in the constructor would bypass any configured ObjectRenderer. I do like turning the Object into a String in the constructor, because you get a snapshot of the message in one place, not at some later time; and calling toString() can give you different values when called repeatedly. What about submitting a patch?
(In reply to comment #42) I didn't think exactly about getting a snapshot. I just thought defining "message" instance field as the type String would fix the bug merely because inside toString() method of the String class there isn't any Log4j call. The same idea applies when calling org.apache.log4j.spi.ThrowableInformation.getThrowableStrRep() method at the beginning, in the constructor. I believe I achieved the goal. This bug can be easily reproduced. See below. Create the following classes: com.a.AnObject com.b.AnException com.c.AnObjectThread com.d.AnExceptionThread com.main.RootLoggerThread com.main.Main ============== package com.a; import org.apache.log4j.Logger; public class AnObject { private static final Logger LOGGER = Logger.getLogger(AnObject.class); private final String name; public AnObject(String name) { this.name = name; } @Override public String toString() { try { Thread.sleep(4000); } catch (InterruptedException ex) { ex.printStackTrace(); } LOGGER.info("Logging DEBUG in AnObject [" + name + "]"); return name; } } ============== ============== package com.b; import org.apache.log4j.Logger; public class AnException extends RuntimeException { private static final Logger LOGGER = Logger.getLogger(AnException.class); public AnException() { super(); } public AnException(String msg) { super(msg); } public AnException(Throwable t) { super(t); } public AnException(String msg, Throwable t) { super(msg, t); } @Override public String getMessage() { try { Thread.sleep(4000); } catch (InterruptedException ex) { ex.printStackTrace(); } LOGGER.error("Logging ERROR in AnException"); return super.getMessage(); } } ============== ============== package com.c; import com.a.AnObject; import org.apache.log4j.Logger; public class AnObjectThread extends Thread { private static final Logger LOGGER = Logger.getLogger(AnObjectThread.class); public AnObjectThread(String threadName) { super(threadName); } @Override public void run() { AnObject anObject = new AnObject("Object created in AnObjectThread"); LOGGER.info(anObject); } } ============== ============== package com.d; import com.b.AnException; import org.apache.log4j.Logger; public class AnExceptionThread extends Thread { private static final Logger LOGGER = Logger.getLogger(AnExceptionThread.class); public AnExceptionThread(String threadName) { super(threadName); } @Override public void run() { LOGGER.info("Just logging INFO in AnExceptionThread", new AnException("test exception", new AnException("cause exception"))); } } ============== ============== package com.main; import org.apache.log4j.Logger; public class RootLoggerThread extends Thread { private static final Logger LOGGER = Logger.getLogger(RootLoggerThread.class); public RootLoggerThread(String threadName) { super(threadName); } @Override public void run() { LOGGER.info("Just logging INFO in RootLoggerThread"); } } ============== ============== package com.main; import com.c.AnObjectThread; import com.d.AnExceptionThread; /** * Bug 41214 reproduction. * * https://issues.apache.org/bugzilla/show_bug.cgi?id=41214 * * @author Marcelo S. 
Miashiro */ public class Main { public static void main(String[] args) { RootLoggerThread rootLoggerThread = new RootLoggerThread("RootLoggerThread"); AnObjectThread anObjectThread = new AnObjectThread("AnObjectThread"); AnExceptionThread anExceptionThread = new AnExceptionThread("AnExceptionThread"); anExceptionThread.start(); anObjectThread.start(); try { // To reproduce the bug, com.a.AnObject.toString() and com.b.AnException.getMessage() // methods must be called before rootLogger Thread.sleep(2000); } catch (InterruptedException ex) { ex.printStackTrace(); } rootLoggerThread.start(); } } ============== And, of course, create the log4j.properties file with the following content: ============== log4j.rootLogger=INFO, ConsoleAppender log4j.logger.com.c=INFO, ConsoleAppender log4j.logger.com.d=INFO, ConsoleAppender log4j.appender.ConsoleAppender=org.apache.log4j.ConsoleAppender log4j.appender.ConsoleAppender.layout=org.apache.log4j.PatternLayout log4j.appender.ConsoleAppender.layout.ConversionPattern=[%d{dd-MM-yyyy HH:mm:ss,SSS}][%-5p][%c] %M : %m%n ============== Try running the Main class with log4j version 1.2.17 firstly without the fix I suggested. You see a deadlock happening. Try using jconsole to see what's going on. You'll see in jconsole something like this: ============== Name: AnExceptionThread State: BLOCKED on org.apache.log4j.ConsoleAppender@13e75a5 owned by: AnObjectThread Total blocked: 2 Total waited: 0 Stack trace: org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:231) org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66) org.apache.log4j.Category.callAppenders(Category.java:206) - locked org.apache.log4j.Logger@147e668 org.apache.log4j.Category.forcedLog(Category.java:391) org.apache.log4j.Category.info(Category.java:683) com.d.AnExceptionThread.run(AnExceptionThread.java:19) ============== ============== Name: AnObjectThread State: BLOCKED on org.apache.log4j.spi.RootLogger@157b46f owned by: RootLoggerThread Total blocked: 1 Total waited: 1 Stack trace: org.apache.log4j.Category.callAppenders(Category.java:205) org.apache.log4j.Category.forcedLog(Category.java:391) org.apache.log4j.Category.info(Category.java:666) com.a.AnObject.toString(AnObject.java:24) org.apache.log4j.or.DefaultRenderer.doRender(DefaultRenderer.java:37) org.apache.log4j.or.RendererMap.findAndRender(RendererMap.java:80) org.apache.log4j.spi.LoggingEvent.getRenderedMessage(LoggingEvent.java:368) org.apache.log4j.helpers.PatternParser$BasicPatternConverter.convert(PatternParser.java:402) org.apache.log4j.helpers.PatternConverter.format(PatternConverter.java:65) org.apache.log4j.PatternLayout.format(PatternLayout.java:506) org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:310) org.apache.log4j.WriterAppender.append(WriterAppender.java:162) org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251) - locked org.apache.log4j.ConsoleAppender@13e75a5 org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66) org.apache.log4j.Category.callAppenders(Category.java:206) - locked org.apache.log4j.Logger@12c4c57 org.apache.log4j.Category.forcedLog(Category.java:391) org.apache.log4j.Category.info(Category.java:666) com.c.AnObjectThread.run(AnObjectThread.java:20) ============== ============== Name: RootLoggerThread State: BLOCKED on org.apache.log4j.ConsoleAppender@13e75a5 owned by: AnObjectThread Total blocked: 1 Total waited: 0 Stack trace: 
org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:231) org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66) org.apache.log4j.Category.callAppenders(Category.java:206) - locked org.apache.log4j.spi.RootLogger@157b46f org.apache.log4j.Category.forcedLog(Category.java:391) org.apache.log4j.Category.info(Category.java:666) com.main.RootLoggerThread.run(RootLoggerThread.java:18) ============== After applying the fix I suggested, no more deadlock happens. I have already submitted the patch to Log4j team.
Created attachment 29064 [details]
Log4j bug deadlock fix and reproduction (Bug 41214)

Hi Apache Log4j dev team,

I am attaching the Bug41214.zip file. Inside this file there are two other zip files: Bug41214_Reproduction.zip and Log4j_1.2.17_Bug41214_Fix.zip. I am submitting a patch for bug 41214 (https://issues.apache.org/bugzilla/show_bug.cgi?id=41214), in the Log4j_1.2.17_Bug41214_Fix.zip attached file. Please read my explanation below:

Actually, I think the solution (or a mitigation) might be simple. I suggest the following "workaround" modifications in Log4j version 1.2.17 (the latest version):

1) In the org.apache.log4j.spi.LoggingEvent class, inside all the constructors:

Replace:
this.message = message;

By:
this.message = message != null ? message.toString() : null;

Doing so, you make the "message" instance field always of type String.

2) In the org.apache.log4j.spi.ThrowableInformation class:

2.1) Create the following instance field:
private final String[] stackTraceArray;

And, inside all the constructors, put the following piece of code:
this.stackTraceArray = getThrowableStrRep();

The intention here is to use the "stackTraceArray" instance field instead of calling getThrowableStrRep() later.

2.2) Make getThrowableStrRep() private. Doing so, there is no need to maintain the synchronization of this method. Actually, I think there would be no need to synchronize this method even if it were public, because I didn't see any place in the Log4j source code where an instance of ThrowableInformation is shared by more than one thread. I do not understand what the intention of the developer that created that method was; maybe he/she predicted a future need for synchronization, as ThrowableInformation is public and has public constructors. We never know the sort of things other programmers will do with our public classes and methods (the "bug in Log4j" we are discussing here shows exactly such a situation, doesn't it?)

2.3) Create a public method called "getStackTraceArray()" that returns the "stackTraceArray" instance field (or we can also return a clone of this array).

3) In all the places where getThrowableStrRep() is being called, replace it with a getStackTraceArray() invocation. This replacement should be done in the following classes:
org.apache.log4j.pattern.ThrowableInformationPatternConverter
org.apache.log4j.pattern.LogEvent
org.apache.log4j.lf5.Log4JLogRecord

I do not think this is related just to RollingFileAppender, as the title of this page suggests. It's a synchronization-among-threads problem, not directly related to any particular appender. The idea behind this solution is based on the fact that inside the toString() method of the String class there isn't any Log4j call.

Trying to understand who is "guilty" of the problem (whether it is Log4j or the application/framework that uses Log4j) is not so important. What is relevant is the fact that we have a big, serious problem that has to be solved. Otherwise, we can never rely on a multithreaded application subject to a Log4j deadlock. If the solution I proposed doesn't solve the problem, I will really consider the alternative of removing Log4j from my application. I will sadly do that, although I like Log4j so much. But I tested my solution, and I think it works and covers all the critical cases. Of course, as I said, this solution might be considered just a mitigation, not a real solution. In my opinion, a real and elegant solution would require a deeper refactoring.
I also want to tell that this problem really happens, this is not just theoretical. I have a multithreaded application that uses Apache Mina framework for exchanging bytes with another multithreaded application. Sometimes the application simply stops working. I didn't understand why some threads remain BLOCKED until I have found the web page of the bug 41214. This web page was my salvation. In the Bug41214_Reproduction.zip attached file, you have the source codes of an executable that reproduces the bug. First, try running this executable with log4j version 1.2.17 without the fix suggested by me. You see a deadlock happening. Try using jconsole to see what's going on. You'll see in jconsole something like this: ============== Name: AnExceptionThread State: BLOCKED on org.apache.log4j.ConsoleAppender@13e75a5 owned by: AnObjectThread Total blocked: 2 Total waited: 0 Stack trace: org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:231) org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66) org.apache.log4j.Category.callAppenders(Category.java:206) - locked org.apache.log4j.Logger@147e668 org.apache.log4j.Category.forcedLog(Category.java:391) org.apache.log4j.Category.info(Category.java:683) com.d.AnExceptionThread.run(AnExceptionThread.java:19) ============== ============== Name: AnObjectThread State: BLOCKED on org.apache.log4j.spi.RootLogger@157b46f owned by: RootLoggerThread Total blocked: 1 Total waited: 1 Stack trace: org.apache.log4j.Category.callAppenders(Category.java:205) org.apache.log4j.Category.forcedLog(Category.java:391) org.apache.log4j.Category.info(Category.java:666) com.a.AnObject.toString(AnObject.java:24) org.apache.log4j.or.DefaultRenderer.doRender(DefaultRenderer.java:37) org.apache.log4j.or.RendererMap.findAndRender(RendererMap.java:80) org.apache.log4j.spi.LoggingEvent.getRenderedMessage(LoggingEvent.java:368) org.apache.log4j.helpers.PatternParser$BasicPatternConverter.convert(PatternParser.java:402) org.apache.log4j.helpers.PatternConverter.format(PatternConverter.java:65) org.apache.log4j.PatternLayout.format(PatternLayout.java:506) org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:310) org.apache.log4j.WriterAppender.append(WriterAppender.java:162) org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251) - locked org.apache.log4j.ConsoleAppender@13e75a5 org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66) org.apache.log4j.Category.callAppenders(Category.java:206) - locked org.apache.log4j.Logger@12c4c57 org.apache.log4j.Category.forcedLog(Category.java:391) org.apache.log4j.Category.info(Category.java:666) com.c.AnObjectThread.run(AnObjectThread.java:20) ============== ============== Name: RootLoggerThread State: BLOCKED on org.apache.log4j.ConsoleAppender@13e75a5 owned by: AnObjectThread Total blocked: 1 Total waited: 0 Stack trace: org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:231) org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66) org.apache.log4j.Category.callAppenders(Category.java:206) - locked org.apache.log4j.spi.RootLogger@157b46f org.apache.log4j.Category.forcedLog(Category.java:391) org.apache.log4j.Category.info(Category.java:666) com.main.RootLoggerThread.run(RootLoggerThread.java:18) After applying the fix I suggest, no more deadlock happens. ============== Thanks in advance Marcelo Sussumu Miashiro.
(In reply to comment #41) I too like the snapshotting aspects of this solution; unfortunately, some people use toString() to implement expensive pretty-printing calls. You wouldn't want to call toString() eagerly on a debug()-level call, only to have it be dropped on the floor.
I can see the deadlock from here.

With nested Categories A->B->C: while you're in B.callAppenders, B is locked. If toObject() attempts to log, then C.callAppenders is invoked again. If someone else has already called C.callAppenders, you have a deadlock: one thread has C waiting for B, another has B waiting for C.

The classical solution to this problem is to sort your locks and grab them in the same predictable order. The parent-child relationship -almost- solves the problem here, except for a couple of problems: one is being able to rearrange your loggers while logging is happening, and the other is that pesky toObject() method.

What you probably should do here is call parent.callAppenders while still holding your own lock. With deeply nested Categories this will widen your bottleneck but eliminate the deadlock. I won't be able to write to c.log while someone is still writing to b.log or a.log, and unfortunately it also breaks your emitNoAppenderWarning fallback, because callAppenders returns void.

It's unclear to me why the parentage of a Category can be updated after it's created. It seems like it's an artifact of the evolution of the library. I'm not familiar with this aspect of Log4j, but it seems that this is a case of premature publication. Add on a sprinkle of too much delegation and a dash of concurrency and you've got quite a soup here.

I thought I had a patch proposal (that didn't break the API), but it turned out on further inspection to have the exact same failure mode. So instead of my help all I can offer is my sympathy. I now understand why this has been open for so long.

I do think that if you pull some of the deprecated things out of this part of the API, it may be easier to fix.
(In reply to comment #46)

After discussion with some colleagues, I would like to propose the following fix. You cannot protect AppenderAttachableImpl from here; you can only protect the lazy instantiation. All of the public methods on AppenderAttachableImpl should be marked synchronized if you're worried about add and remove calls causing problems.

public void callAppenders(LoggingEvent event) {
    int writes = 0;
    Category c = this;
    while (c != null) {
        boolean additive;
        AppenderAttachableImpl aai;
        // Protected against simultaneous call to addAppender, removeAppender,...
        synchronized(c) {
            aai = c.aai;
            additive = c.additive;
            c = c.parent;
        }
        if (aai != null) {
            writes += aai.appendLoopOnAppenders(event);
        }
        if (!additive) {
            break;
        }
    }
    if (writes == 0) {
        repository.emitNoAppenderWarning(this);
    }
}

With this code, no lock is held on the hierarchy while toString() can be called, and no two locks are held at once, breaking the deadlock race condition. In-flight log messages can still show up in the old log while structural changes are being made, but I believe that was always a danger anyway.

Thoughts?
Since you asked for thoughts.... Please test Log4j 2.
(In reply to comment #47)
> synchronized(c) {
>     aai = c.aai;
>     additive = c.additive;
>     c = c.parent;
> }
>
> if (aai != null) {
>     writes += aai.appendLoopOnAppenders(event);
> }

The code above is risky. The reason for that synchronized block is lost: that code doesn't protect against a simultaneous call of aai.appendLoopOnAppenders() from two or more threads. You are throwing away the main purpose of that synchronization.

The solution proposed in comment #37 works pretty well, but unfortunately it depends on JRE 1.5 or later. The solution proposed by me also works, even on JRE 1.4.
(In reply to comment #48)
> Since you asked for thoughts.... Please test Log4j 2.

Yes, the bug presented on this page is solved in log4j version 2. I tested with log4j-2.0-beta2 and no deadlock happens any more. But, unfortunately (or FORTUNATELY, depending on the point of view), this version of log4j requires JDK/JRE 1.5 or later.
We are getting stuck threads on a Windows 2003 environment. We are using version 1.2.14 of log4j.

2013-07-08 15:35:35,098] INFO : Size: 0; Questions: 113; Hits: 19; Adaptive r/w ratio: 1%; Total hitrate: 17%
<08-Jul-2013 15:35:36 o'clock BST> <Error> <WebLogicServer> <BEA-000337> <[STUCK] ExecuteThread: '140' for queue: 'weblogic.kernel.Default (self-tuning)' has been busy for "627" seconds working on the request "Http Request: /NewsBroker/news.rss", which is more than the configured time (StuckThreadMaxTime) of "600" seconds.
Stack trace:
org.apache.log4j.Category.callAppenders(Category.java:201)
org.apache.log4j.Category.forcedLog(Category.java:388)
org.apache.log4j.Category.error(Category.java:319)
com.astrazeneca.portal.rss.TransformerUtils.getChannels(TransformerUtils.java:154)
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
java.lang.reflect.Method.invoke(Method.java:592)
org.apache.xalan.extensions.ExtensionHandlerJavaPackage.callFunction(ExtensionHandlerJavaPackage.java:332)
org.apache.xalan.extensions.ExtensionHandlerJavaPackage.callFunction(ExtensionHandlerJavaPackage.java:426)
org.apache.xalan.extensions.ExtensionsTable.extFunction(ExtensionsTable.java:220)

We get hundreds of similar stuck threads daily. Is there any workaround for this issue?

Thanks,
Nishant
(In reply to nishant.patralekh from comment #51) > We are getting Stuck threads on Windows 2003 environment. We are using > version 1.2.14 of log4j. > ... > We get hundreds of similar stuck threads daily. > Is there any workaround for this issue? The best solution for this problem is to upgrade to log4j 2.0. 1.2.14 is extremely old and log4j 2.0 does address a lot of the multi-threading issues.
Thanks for such a quick response. Where can I check if there are any compatibility issues with log4j 2.0? Currently we are using jdk150_04. Also, where can we get the jar from? Thanks, Nishant
I upgraded to log4j2 and it was still jamming up for ten to twenty seconds, directly after midnight when rollover and compression happen.

One thread is

java.lang.Thread.State: RUNNABLE
    at java.io.RandomAccessFile.writeBytes(Native Method)
    at java.io.RandomAccessFile.write(RandomAccessFile.java:499)
    at org.apache.logging.log4j.core.appender.rolling.FastRollingFileManager.flush(FastRollingFileManager.java:105)
    at org.apache.logging.log4j.core.appender.rolling.FastRollingFileManager.write(FastRollingFileManager.java:89)
    - locked <0x00000000e81ae1e0> (a org.apache.logging.log4j.core.appender.rolling.FastRollingFileManager)
    at org.apache.logging.log4j.core.appender.OutputStreamManager.write(OutputStreamManager.java:129)
    at org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.append(AbstractOutputStreamAppender.java:115)
    at org.apache.logging.log4j.core.appender.FastRollingFileAppender.append(FastRollingFileAppender.java:97)
    at org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:102)
    at org.apache.logging.log4j.core.config.LoggerConfig.callAppenders(LoggerConfig.java:426)
    at org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:407)
    at org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:410)
    at org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:368)
    at org.apache.logging.log4j.core.Logger.log(Logger.java:110)
    at org.apache.logging.log4j.spi.AbstractLoggerWrapper.log(AbstractLoggerWrapper.java:55)
    at org.apache.logging.log4j.spi.AbstractLogger.info(AbstractLogger.java:984)

and then scores of other threads at

java.lang.Thread.State: BLOCKED (on object monitor)
    at org.apache.logging.log4j.core.appender.rolling.RollingFileManager.checkRollover(RollingFileManager.java:108)
    - waiting to lock <0x00000000e81ae1e0> (a org.apache.logging.log4j.core.appender.rolling.FastRollingFileManager)
    at org.apache.logging.log4j.core.appender.FastRollingFileAppender.append(FastRollingFileAppender.java:88)
    at org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:102)
    at org.apache.logging.log4j.core.config.LoggerConfig.callAppenders(LoggerConfig.java:426)
    at org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:407)
    at org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:410)
    at org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:368)
    at org.apache.logging.log4j.core.Logger.log(Logger.java:110)
    at org.apache.logging.log4j.spi.AbstractLoggerWrapper.log(AbstractLoggerWrapper.java:55)
    at org.apache.logging.log4j.spi.AbstractLogger.info(AbstractLogger.java:984)

Changing to the disruptor fixed the problem (async appenders might have fixed the problem too, I suspect).
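In case it helps others who land here: the "disruptor" route mentioned above is, as far as I understand it, the all-asynchronous-loggers mode of Log4j 2, enabled by setting a system property before the first logger is obtained and by having the LMAX Disruptor jar on the classpath. A minimal sketch (the class name is made up; normally the property is passed as -D on the JVM command line rather than set in code):

public class AsyncLoggerBootstrap {
    public static void main(String[] args) {
        // Selects the async logger context; requires the LMAX Disruptor on the classpath.
        System.setProperty("Log4jContextSelector",
                "org.apache.logging.log4j.core.async.AsyncLoggerContextSelector");

        org.apache.logging.log4j.Logger log =
                org.apache.logging.log4j.LogManager.getLogger(AsyncLoggerBootstrap.class);
        log.info("logged via the async logger ring buffer");
    }
}

With all loggers asynchronous, application threads only publish events to a ring buffer, and the single background thread is the one that blocks on rollover and compression.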
@mck@apache.org: Please note that the Log4j 2 issues are tracked here: https://issues.apache.org/jira/browse/LOG4J2 I highly encourage you to open an issue there.
Created attachment 31193 [details]
Updated version of log4j 1.2.6 with reentry lock

An updated version for Log4j 1.2.6; this version covers an additional spot missed in the earlier patch. It has been stress tested with lots of JVM threads to ensure the log4j 1.x series doesn't deadlock in a highly stressed environment.
Hi people. I have the same issue with log4j 1.2.17. Any update on this? Is there an ETA? Thanks!
I'm afraid you will not get much more than our stock answer here: our energy is focused on Log4j 2, and in particular on getting our next release, 2.3, out the door, hopefully in a week or two. For v1, there are no current plans to release a version 1.2.18.