Bug 50486 - Memoryleak - org.apache.log4j.helpers.ThreadLocalMap
Memoryleak - org.apache.log4j.helpers.ThreadLocalMap
Status: RESOLVED FIXED
Product: Log4j
Classification: Unclassified
Component: Other
1.2
PC All
: P2 major
: ---
Assigned To: log4j-dev
:
Depends on:
Blocks:
  Show dependency tree
 
Reported: 2010-12-16 10:33 UTC by kuku
Modified: 2014-02-17 13:48 UTC (History)
10 users (show)



Attachments
submit a patch for memory leak in application server (483 bytes, patch)
2011-12-20 15:03 UTC, Yingchun Chen
Details | Diff
clear the MDC when last key is removed + testcase (3.48 KB, patch)
2012-03-07 22:04 UTC, Maarten Bosteels
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description kuku 2010-12-16 10:33:00 UTC
Tomcat 7 log's this when doing a shutdown:

The web application [/MyApp] created a ThreadLocal with key of type [org.apache.log4j.helpers.ThreadLocalMap] (value [org.apache.log4j.helpers.ThreadLocalMap@1d3e069]) and a value of type [java.util.Hashtable] (value [{}]) but failed to remove it when the web application was stopped. This is very likely to create a memory leak.

I'm using it the following way:

try {
  if (MDC.get(..) == null) {
    MDC.put(.., ...);
  }
   ....
  } finally {
     MDC.remove(...);
  }

After taking a quick look at the source of log4j i think the problem is that remove only removes the entry from the HashTable but does not call remove on the ThreadLocal itself.
Comment 1 chahuistle 2011-02-09 10:52:34 UTC
I had the same problem. However, in one of tomcat's filters, I make sure that, for each request, the MDC gets cleared.

public final void doFilter(...) throws IOException, ServletException {
    MDC.clear();
}

When I shutdown tomcat, more often than not, I can see the same messages on the console log. I agree on what kuku said. The code that is "clearing" MDC looks like:

private
  void clear0() {
    if(!java1 && tlm != null) {
      Hashtable ht = (Hashtable) ((ThreadLocalMap)tlm).get();
      if(ht != null) {
        ht.clear();
      } 
    }
  }

I think the following code would fix it:

private
  void clear0() {
    if(!java1 && tlm != null) {
      Hashtable ht = (Hashtable) ((ThreadLocalMap)tlm).get();
      if(ht != null) {
        ht.clear();
      }
      // clear the thread local map 
      ((ThreadLocalMap)tlm).remove();
    }
  }
Comment 2 marcel 2011-03-18 06:29:30 UTC
I raised this issue before on the dev list: http://www.mail-archive.com/log4j-dev@logging.apache.org/msg11378.html
Comment 3 David North 2011-09-20 11:29:05 UTC
Any prospect of this being fixed any time soon? If I submitted a patch based on calling the ThreadLocal remove() method (by reflection for JDK1.4 compatibility), would it be accepted?
Comment 4 nestor.urquiza 2011-10-03 16:38:28 UTC
Tomcat 7 logs the problem as SEVERE. That is actually good as this is probably a problem affecting Tomcat 6 with memory leaks.

I vote for a fix on this.

Thanks!
-Nestor
Comment 5 R. Oosterholt 2011-11-03 15:05:21 UTC
I've tested the suggested patch by chahuistle (on 2011-02-09) and it works like a charm.

I vote for a fix on this.

Thanks!
-Rick
Comment 6 Yingchun Chen 2011-12-20 15:03:20 UTC
Created attachment 28091 [details]
submit a patch for memory leak in application server

This patch file just contains the "((ThreadLocalMap)tlm).remove();" to fix the problem below:

SEVERE: The web application [/my_app] created a ThreadLocal with key of type [org.apache.log4j.helpers.ThreadLocalMap] (value [org.apache.log4j.helpers.ThreadLocalMap@568614]) and a value of type [java.util.Hashtable] (value [{userId=:unknown:, monitorID=0000000000000000}]) but failed to remove it when the web application was stopped. This is very likely to create a memory leak.
Dec 19, 2011 12:02:09 AM org.apache.catalina.loader.WebappClassLoader clearThreadLocalMap
Comment 7 grobmeier 2012-01-13 22:41:50 UTC
applied with rv1231361, will come with 1.2.17.

Thanks for the patch!
Comment 8 Ralph Goers 2012-01-14 20:07:59 UTC
This patch needs to be reverted or modified. The patch as applied requires Java 5 and will throw exceptions in earlier versions. Log4j 1.x supports very old JDKs.
Comment 9 f.bantner 2012-01-25 17:23:23 UTC
I'm monitoring this report for quiet some time now and I don't understand the problem why it is not fixed.

This is one of the major sources of memory leaks in tomcat which bugges every developer for quiet some time now.

The fix would be something like a 3-liner and is a complete no-brainer: simply call the remove method via reflections if it exists. 

So here is the "patch":

try {
  ThreadLocalMap.class.getMethod( "remove" ).invoke( tlm );
} catch( NoSuchMethodException e ){
  // Do nothing
}

This won't fix the bug for the 1.4 people but for the 90% of the rest of us which aren't bound to old problems anyway.

So please please fix this.
Comment 10 marcel 2012-01-25 18:52:46 UTC
Is there a particular reason I can't vote for this bug (as I usually can in bugzilla)? I guess votes would be through the roof if voting was activated/allowed...
Do you (log4j devs) really want all of us affected user to create our own custom log4j version just to incorporate this fix?
Comment 11 Ralph Goers 2012-01-25 21:32:07 UTC
The patch was modified by Christian on Jan 17 so that it won't break old JDKs. You should be able to get the source from subversion and use it.
Comment 12 marcel 2012-01-25 22:10:41 UTC
Ok, the patch was accepted, modified and applied I get that.
a) Shouldn't this ticket be marked as 'resolved'?
b) What's the ETA for 1.2.17?

Also, I believe there's a second issue. The way it's implemented right know clients must explicitly call MDC.clear(). Couldn't/shouldn't remove() call clear() internally once the last item was removed()?
How do clients know when to call clear()? In a request processing chain (e.g. Servlet filter) several parties may write items and each one of them is responsible to call remove() for the items it added. On the way it items are added, on the way out they're removed in reverse order. And the last guy closes the door i.e. calls clear(). However, how does a client know that he's the last?
Comment 13 grobmeier 2012-01-25 22:36:07 UTC
Hello Marcel,

1) yes- but lets wait if other devs have comments to your questions below. I reposted them to the ml

2) I think the release should happen immediately after we agree on this fix. I have volunteered to make this release, but haven't done this before. I cannot give you a date, but I will try to release a few days after your questions are answered and after everybody agrees to the content of 1.2.17

Cheers
Christian
Comment 14 Patric Rufflar 2012-02-23 14:17:13 UTC
One month passed - any news for the 1.2.17 release?
Comment 15 Maarten Bosteels 2012-03-07 22:04:33 UTC
Created attachment 28436 [details]
clear the MDC when last key is removed + testcase

I agree that remove() should call clear when the last key was removed for the reason described by marcel@frightanic.com 

I have attached a trivial fix and a small test-case with a simplified version of the code that tomcat uses to check for memory leaks.
Comment 16 marcel 2012-03-08 07:52:19 UTC
Ok, what else is still missing before 1.2.17 will see daylight?
Comment 17 grobmeier 2012-03-08 10:49:28 UTC
I will give it a next try end of next week.
Comment 18 grobmeier 2012-03-08 10:54:42 UTC
@Maarten Bosteels: I looked at this patch, feels good to me. Will include it with 1.2.17 (except others have objections).

Cheers + Thanks!
Comment 19 Jim O'Hara 2012-04-19 15:09:26 UTC
Maarteen,

I made your change in MDC.java, recompiled the code with the build.xml and created new log4j.jar and still get errors in a Struts 2 application running on Tomcat:

created a ThreadLocal with key of type [org.apache.log4j.helpers.ThreadLocalMap] (value [org.apache.log4j.helpers.ThreadLocalMap@13c339f]) and a value of type [java.util.Hashtable] (value [{userId=jo2372}]) but failed to remove it when the web application was stopped. This is very likely to create a memory leak.
Comment 20 Gary Gregory 2012-04-19 16:59:24 UTC
Do we still need org.apache.log4j.helpers.Loader.java1? It looks to be for Java 1.1, right?

It seems to add a lot code in the area we are discussing.
Comment 21 grobmeier 2012-04-20 09:48:02 UTC
We have vote recently to drop jdk 1.3 - means, no, we don't need to take care on Java 1.1 anymore (thanks heaven). 

This also makes the patch easier, because in the current version I took care on jdk 1.3 users.
Comment 22 grobmeier 2012-04-30 15:07:35 UTC
Ehm, before remove java1 we should check if problems occur with java 1.4. 
I just saw the "ThreadLocalMap" class does offer "remove" only with jdk5.
Comment 23 grobmeier 2012-05-03 11:31:49 UTC
fixed with 1.2.17
Comment 24 marcel 2012-05-03 11:51:41 UTC
Thanks! Would anyone be so kind and push this to Maven central?
I see the release is done (http://svn.apache.org/viewvc/logging/log4j/tags/log4j-1.2.17/) but neither http://logging.apache.org/log4j/1.2/download.html nor Maven central lists 1.2.17 yet.
Comment 25 grobmeier 2012-05-03 11:58:46 UTC
Hello Marcel,

at the ASF we need to vote on our releases. I have create a first RC candidate before 2 days, but need to create another one today. Then add 72 hours voting time to it. After that I can push it to maven. So sorry, it needs another little while, but we are close :-)

We will send an announcement to our users mailinglist and the dev list - feel free to join :-). In addition we will announce it on our blog:
http://blogs.apache.org/logging/

Cheers
Christian
Comment 26 john caron 2012-05-14 19:46:32 UTC
Hi: any word on this release? Would love to cross this problem off my list. Thanks for your hard work! John
Comment 27 marcel 2012-05-14 19:49:32 UTC
(In reply to comment #26)
> Hi: any word on this release? Would love to cross this problem off my list.
> Thanks for your hard work! John

You need to follow the thread "[VOTE] Release log4j 1.2.17 RC3" at http://mail-archives.apache.org/mod_mbox/logging-log4j-dev/201205.mbox/browser
Comment 28 Mattias Ordboken 2012-05-23 20:52:47 UTC
It’s almost frightening how long it takes for such a popular and widely used project to get a new release out.

It’s even more alarming that the reason for the delay is that they do not know how to build the release package anymore (see above mentioned thread at the mailing list for details).
Comment 29 grobmeier 2012-05-24 05:58:18 UTC
Hi Mattias,

the log4j build is highly complex due to windows dll building. Imagine a build system which has grown for the last 10 years or so. This is an issue which must be fixed soon. Actually there are people who know how to build it, but as we are a small team of volunteers we are in the bad situation that we still miss a required +1.

Once we have this release out I will put pressure on the refactoring of the log4j build system. It actually must become easier, otherwise it will become more and more difficult to release log4j1.

You are welcome to the project if you want to help us. Please join the log4j-dev mailinglist.

Cheers
Christian
Comment 30 grobmeier 2012-05-28 14:42:34 UTC
Hello all, 
1.2.17 is in the wild and can be downloaded now.
thanks for your patience
Cheers!
Comment 31 Jim O'Hara 2012-05-29 15:13:21 UTC
(In reply to comment #30)
> Hello all, 
> 1.2.17 is in the wild and can be downloaded now.
> thanks for your patience
> Cheers!

Are the updated files at:
http://logging.apache.org/log4j/1.2/download.html

Tried the jar, still getting ...[webapp] created a ThreadLocal with key of type [org.apache.log4j.helpers.ThreadLocalMap] (value [org.apache.log4j.helpers.ThreadLocalMap@f6af3b]) and a value of type [java.util.Hashtable] (value [{userId=jo2372}]) but failed to remove it when the web application was stopped. This is very likely to create a memory leak.
Comment 32 Gary Gregory 2012-05-29 18:00:48 UTC
I know this may be difficult to do, but can you provide a unit test so we can make sure we get it fixed for good?
Comment 33 Mattias Ordboken 2012-05-29 20:47:25 UTC
It took one and a half year to "fix" this bug, two months to get a release out and now the issue is still occurring?

I must be dreaming - this cannot be true! 

Has nobody tested this fix by just putting a simple war (which is putting some MDC value, logging only one line) to a fairly modern tomcat instance, start and stop the app and just took a look at the log files?

My feelings about this project is getting even worse.
Sorry for the rant...
Comment 34 Gary Gregory 2012-05-30 02:02:11 UTC
@Mattias Would you be willing to redirect you energy away from ranting ;) and towards patch writing? Unit tests are king. With a reproducible test case, we can try different solutions.
Comment 35 Ralph Goers 2012-05-30 03:09:43 UTC
I hate to ask the obvious, but was MDC.remove() called for all the items? I notice that the message Jim is getting still has a userId. The fix for this issue removes the underlying Hashtable only when all the MDC values are removed.
Comment 36 grobmeier 2012-05-30 05:05:15 UTC
Isn't that what clear() is for?

When somebody calls clear() then ThreadLocal.remove() is called which should remove the whole map. Imho on container shutdown MDC.clear() should be called. Therefore I would think 

1) Hierarchy.shutdown() should call MDC.clear()
2) MDC.clear should do this after clearing: MDC.mdc = new MDC();

Wdyt?
Comment 37 Ralph Goers 2012-05-30 05:26:23 UTC
Yes, that is what clear is for.

I don't see how Hierarchy.shutdown() can call MDC.clear(). Well, it could but it would only clear the thread the shutdown is performed on which isn't likely to help much.

Item 2 is likely to cause many problems.  MDC.clear only clears the current thread's ThreadLocal. Reinitializing the MDC will cause problems for all the other threads.
Comment 38 grobmeier 2012-05-30 06:43:52 UTC
Jim, it seems the usual way to hold your memory clean is to call the clear() method appropriate.

Ralph Goers explained on the mailinglist:

>> All the MDC is is a ThreadLocal where each Thread has its own Map associated
>> with the ThreadLocal. You can get and put all you want. In a webapp, of
>> course, if you add stuff to the MDC at the beginning of a request then you
>> need to remove it all at the end of the request.  The normal pattern is
>>
>> MDC.put(key, value);
>> try {
>>    call servlet
>> } finally {
>>   MDC.remove(key);
>> }

As far as I understand it, using the MDC.clear() method should help you in this case (with 1.2.17). In a servlet environment, you might want to put this code into the Servlet.destroy() method:

http://docs.oracle.com/javaee/6/api/javax/servlet/Servlet.html#destroy()

"This method gives the servlet an opportunity to clean up any
resources that are being held (for example, memory, file handles,
threads)"

Could you try out if that helps you? If it does, we'll provide accordingly documentation and finally close the issue afterwards. If it does not, we'll do some more bug hunting.

Thank you!
Comment 39 marcel 2012-05-30 06:52:09 UTC
In comment #12 I raised this issue before...
"Couldn't/shouldn't remove() call clear() internally once the last item was removed()?" For the reasons given in my comment (Servlet filter chain) you want to clear the MDC/ThreadLocal with each thread and not just when the Servlet/web app is destroyed.

With comment #15 a patch was supplied that promises to do just that?
Comment 40 grobmeier 2012-05-30 07:09:42 UTC
Hi Marcel,

this patch is actually applied (slightly modified):

http://svn.apache.org/repos/asf/logging/log4j/trunk/src/main/java/org/apache/log4j/MDC.java

 private void remove0(String key) {
        if (!java1 && tlm != null) {
            Hashtable ht = (Hashtable) ((ThreadLocalMap) tlm).get();
            if (ht != null) {
                ht.remove(key);
                // clean up if this was the last key
                if (ht.isEmpty()) {
                    clear0();
                }
            }
        }
    }


If the Hashtable is empty after key removal, the clear0() method is called.

Do you agree?
Comment 41 smarkwal 2012-05-30 07:13:01 UTC
(In reply to comment #38)
> [...] In a servlet environment, you might want to put
> this code into the Servlet.destroy() method:

I don't think that calling the clear method in the servlet's destroy method does the job. The destroy method may be invoked by a background thread of the application server and would remove the ThreadLocal value only from this background thread, not form the HTTP worker threads.

I agree with Marcel that a servlet filter is the best option:

public class ClearMDCFilter implements Filter {

	@Override
	public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) throws IOException, ServletException {
		try {
			chain.doFilter(request, response);
		} finally {
			MDC.clear();
		}
	}

	[...]

}

This way, you don't have to care about removing all keys you set somewhere in your code.
Comment 42 marcel 2012-05-30 07:16:39 UTC
(In reply to comment #40)
> Hi Marcel,
> 
> this patch is actually applied (slightly modified):
> 
> http://svn.apache.org/repos/asf/logging/log4j/trunk/src/main/java/org/apache/
> log4j/MDC.java
> 
>  private void remove0(String key) {
>         if (!java1 && tlm != null) {
>             Hashtable ht = (Hashtable) ((ThreadLocalMap) tlm).get();
>             if (ht != null) {
>                 ht.remove(key);
>                 // clean up if this was the last key
>                 if (ht.isEmpty()) {
>                     clear0();
>                 }
>             }
>         }
>     }
> 
> 
> If the Hashtable is empty after key removal, the clear0() method is called.
> 
> Do you agree?

Yes, looks good to me. I just mentioned it again because calling clean() in Servlet#destroy() doesn't make sense in my opinion. In fact, clients shouldn't have to call clean() at all if they properly call remove() for each key they add.
Comment 43 marcel 2012-05-30 07:18:01 UTC
In reply to comment #42)
> Yes, looks good to me. I just mentioned it again because calling clean() in
> Servlet#destroy() doesn't make sense in my opinion. In fact, clients
> shouldn't have to call clean() at all if they properly call remove() for
> each key they add.

Sorry, make that clear() instead of clean().
Comment 44 grobmeier 2012-05-30 07:19:14 UTC
Thanks Stephan, I agree with you guys. I will use your code snippet for the FAQ.

Thanks Marcel for confirming the change.

Once the FAQ entry is ready, I will close this issue again with the appropriate link.
Comment 45 marcel 2012-05-30 07:32:29 UTC
(In reply to comment #44)
> Thanks Stephan, I agree with you guys. I will use your code snippet for the
> FAQ.

While Stephan's code certainly is correct (actually, log4j could provide that filter) and does the job I wouldn't be happy to see this mentioned prominently in the FAQ. 

You should make clear to users that if using Servlet filters the proper pattern is something like

try {
    MDC.put(myKey);
    chain.doFilter(request, response);
} finally {
    MDC.remove(myKey);
}

With Stephan's filter you have to be careful to configure it in the right position in web.xml (they're processed sequentially in configuration order). If the filter runs before the last piece of code that depends on the MDC is completed you're in trouble.
Comment 46 smarkwal 2012-05-30 07:36:27 UTC
I can confirm that the memory leak has been fixed for our product after upgrading to 1.2.17. I tested with Tomcat 7.0.26 running on an Oracle JVM 1.6.0_23 on Windows 7.

Thanks to all the people contributing to this fix! You made my day!
Comment 47 grobmeier 2012-05-30 08:58:20 UTC
Thank you Stephan for retesting and confirming the fix! It really helped!
Kudos also to Marcel for your help.

I just added some docs to the MDC javadoc and the FAQ. Please let me know if you would like to see some changes.

http://svn.apache.org/viewvc?view=revision&revision=1344155
Comment 48 Patric Rufflar 2012-05-30 09:14:32 UTC
What if you do not know which key/value pairs have been put to the MDC (if you think of 3rd party libraries, runtime dependent logic etc.)?

Is calling MDC.clear() at the end of doFilter() of a servlet filter sufficient?

I am really looking forward to some kind of Log4j shutdown method  (which should be called during servlet context shutdown) which ensures that all ThreadLocal objects have been removed.
Comment 49 grobmeier 2012-05-30 09:22:53 UTC
You can receive all the keys with doing:

MDC.getContext();

It returns a Hashmap. For each key in the Hashmap, call remove(key).

If you have ideas for such a shutdown() Method, please propose it on log4j-dev

Cheers!
Comment 50 Patric Rufflar 2012-05-30 09:45:14 UTC
@grobmeier

You missed a question in my last comment:
Is calling MDC.clear() at the end of doFilter() of a servlet filter sufficient?


IMHO (just took a quick look at the source), a shutdown() method should just set the field MDC.tlm to null (which references the ThreadLocalMap) - it should be a one-liner.
Comment 51 grobmeier 2012-05-30 10:45:49 UTC
@Patric:

yes I missed your question. Calling MDC.clear() does clear all values from the underlying hashtable and afterwards calls the remove method (if available, which is from java > 1.4). So yes, it should be enough.

Where should the shutdown method reside? In MDC? What is the benefit over clear()? The shutdown method would only affect the current thread btw - other threads would keep the tlm.
Comment 52 Patric Rufflar 2012-05-30 14:55:57 UTC
>Where should the shutdown method reside? In MDC? 

Within MDC would be ok, but for a matter of good design I would put this on an higher level, e.g. inside LogManager.shutdown()


>What is the benefit over clear()? 
>The shutdown method would only affect the current thread btw - other threads >would keep the tlm.

In fact, the opposite is true:
From a real-world perspective, all we need is one method which cares about all threads. This method would be called (for example) in the contextDestroyed() method of a web application and would ensure that the application will never leak again after it has been shut down.

IMHO, this would be a one-liner. (just set tlm to null - instead of calling tlm.remove()).
Comment 53 marcel 2012-05-30 17:56:23 UTC
(In reply to comment #52)
> From a real-world perspective, all we need is one method which cares about
> all threads.

I claim to live in the same real world as you but my perspective is still different. It's called /Thread/Local for a reason. The one thread that adds anything to MDC is responsible for removing it afterwards. If a thread doesn't clean up behind itself it leaves it up for someone else to do the dirty work - not nice.
What you propose is IMO nothing but a fallback for applications that don't adhere to that simple pattern. If log4j offered code for this it would actually discourage folks to use MDC the way it is supposed to be used.
Comment 54 Patric Rufflar 2012-05-30 19:21:40 UTC
>If a thread doesn't clean up behind itself it leaves it up for someone else to >do the dirty work - not nice.
I totally agree - that's way it should be.
Unfortunately this seems not clear for everybody and there will always be enough code around (e.g. 3rd party libraries) which violates this principle (just look at this bug - it took a long time to get log4j cleaning up the resources it acquired)

>What you propose is IMO nothing but a fallback for applications that don't adhere to that simple pattern.
Yes, that's true. But a failsafe which prevents a crash of a production server is better than insisting a perfect world.

And a shutdown procedure that has some effect on other threads is not uncommon - almost any shutdown() I am aware of concerns the outer world/other threads aswell (if you think of connection pools, application servers, operating systems...).
Comment 55 smarkwal 2012-05-31 08:56:26 UTC
I don't think that setting tlm = null on shutdown would make a big difference.

When your web application is undeployed, all classes loaded by the web app's classloader should get unloaded and GC'ed, including the log4j classes. Once the MDC class has been unloaded, the MDC singleton instance and the ThreadLocal instance referenced by it should get collected. And once the ThreadLocal instance is gone, all values stored for worker threads of the application server may get collected as well.

If you deploy log4j in a way that multiple web applications may share a common logger hierarchy, you don't want to call shutdown when you undeploy one of the web applications. A shutdown would affect the other web applications.

Or do I miss something?
Comment 56 Patric Rufflar 2012-05-31 09:44:59 UTC
>Or do I miss something?

I suppose you are.
I'll let some details out but I hope that this helps nevertheless:

The problem is that a ThreadLocal is always implicitly strongly referenced by the thread which it's tied to (until the thread dies).
Because most application servers used thread pools the thread will most probably remain alive.

As a result, a reference to the ThreadLocal is still kept.
If the ThreadLocal references an object of the web application, the web application classloader can also not be garbage collected and most of the memory of the (already shut down) application cannot be freed.

This is exactly the reason why tomcats memory leak detection mechanism is correctly issuing the warning (which this bug report is about).
Comment 57 grobmeier 2012-05-31 09:58:39 UTC
Folks,

I would like to suggest you to bring this discussion to the mailinglist: log4j-dev@logging.apache.org

At the moment this discussion gets only a low exposure as we are mainly discussing new features on the mailinglist (and not on closed issues :-))

Cheers!
Christian
Comment 58 f.bantner 2012-07-16 07:32:40 UTC
Thanks very much for this fix.

Now I have to get slf4j to implement this, too and then I can check another point on my todo: http://bugzilla.slf4j.org/show_bug.cgi?id=266