Bug 47744 - Memory leak when using SSL + Java security manager
Summary: Memory leak when using SSL + Java security manager
Status: RESOLVED FIXED
Alias: None
Product: Tomcat 5
Classification: Unclassified
Component: Connector:Coyote (show other bugs)
Version: 5.5.28
Hardware: PC Linux
: P2 major (vote)
Target Milestone: ---
Assignee: Tomcat Developers Mailing List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-08-26 11:01 UTC by Greg Vanore
Modified: 2010-01-27 01:20 UTC (History)
0 users



Attachments
The zip described in the main bug. Contains test client, JSP for the server, and patch. (4.38 KB, application/x-zip-compressed)
2009-08-26 11:01 UTC, Greg Vanore
Details
Patch for this issue. (1.43 KB, patch)
2009-08-26 11:02 UTC, Greg Vanore
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Greg Vanore 2009-08-26 11:01:57 UTC
Created attachment 24170 [details]
The zip described in the main bug. Contains test client, JSP for the server, and patch.

When using SSL + the java security manager, Tomcat leaks memory when servlets
access cert path/cipher as request properties.  When the servlet/JSP asks for
these properties and they are not in the request, they are loaded via the
JSSESupport class.

The offending method calculates the key size and attempts to cache this in the
SSLSession.  Without the security manager, this is fine.  When the security
manager runs, however, the AccessControlContext is different every time. 
Retrieving from the cache will always return null, and putting into the
SSLSession's cache leaks memory.  This part is probably a JVM bug and not
Tomcat's fault, but Tomcat is exercising it nonetheless.

Popular web service tools (such as Apache CXF) retrieve cert properties from
the request for every invocation.  Depending on the policy file, the leak may
be relatively slow, requiring 100s of thousands or millions of requests; or it
could be fast (in our case), requiring only ~40k to crash an instance of Tomcat
limited to 1 gig of memory.

A note here.  The SSLSession can be collected once it is not in use, so the
memory leak isn't permanent.  If you turn down the SSLSession timeout and wait
at least that amount of time in between using a specific session, you might
never see this bug.  You have to continuously use the same session *and* trip
JSSESession into attempting to get the keysize.  In our particular case, we
have long-connected clients that stream information to us, so it is only taking
is 10-20 minutes before we are being affected negatively by this.

I have attached a zip containing:
- A test client
- A script (.bat, sorry guys, but easily adapted) to run the test client
- A .jsp that trips the JSSESupport to call the affected method.
- The simplest possible policy file, to demonstrate it's not a policy file
issue.
Also attached: a patch that fixes this issue. (It's in the zip for good
measure.)

I recommend setting the invocation count to at least 100,000 or so on the test
client.  Use jconsole or a similar tool attached to tomcat to monitor the
memory use.  You should notice that after the test, a large chunk of memory is
not able to be returned by the garbage collector.  If you apply the patch,
update tomcat-coyote.jar, and re-run the load test, you will notice that memory
consumption stays low and you can immediately return to the baseline.
Comment 1 Greg Vanore 2009-08-26 11:02:48 UTC
Created attachment 24171 [details]
Patch for this issue.
Comment 2 Greg Vanore 2009-09-03 07:10:35 UTC
I should add, this was observed with a 64-bit Linux server JVM.  Sun has dispatched a bug to investigate whether this is a JVM/JRE problem or not.
Comment 3 Bill Barker 2009-10-25 16:56:54 UTC
I'm not a big fan of trying to work around JVM bugs, but this patch looks like it is OK (once the missing synchronizing is added).  Synchronizing will add a small hit, but this part of the code is well off the critical path.
Comment 4 Greg Vanore 2009-10-26 09:33:27 UTC
(In reply to comment #3)
> I'm not a big fan of trying to work around JVM bugs, but this patch looks like
> it is OK (once the missing synchronizing is added).  Synchronizing will add a
> small hit, but this part of the code is well off the critical path.

Yes, I agree with everything you said, except Sun has closed and concluded it is *not* a problem with the VM: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6877740 

A bit after posting this I revised my local copy of the patch to add synchronization - should I repost the patch to include it?
Comment 5 Mark Thomas 2009-12-14 06:58:30 UTC
Thanks for the report and the patch. I have applied a modified patch to trunk and proposed it for 6.0.x and 5.5.x.
Comment 6 Mark Thomas 2009-12-21 03:13:54 UTC
Patch has been applied to trunk and will be included in 6.0.21 onwards.
Comment 7 Konstantin Kolinko 2010-01-27 01:20:44 UTC
Patch was applied to 5.5 and will be in 5.5.29 onwards. Thank you.