Bug 56213

Summary: Many KeyReference Objects Cause Long CMS-remark GC Times
Product: Tomcat 7
Reporter: Jeff Middleton <jmiddav>
Component: Connectors
Assignee: Tomcat Developers Mailing List <dev>
Status: RESOLVED FIXED
Severity: normal
Priority: P2
Version: 7.0.47
Target Milestone: ---
Hardware: PC
OS: Linux

Description Jeff Middleton 2014-03-03 22:48:06 UTC
This is a request to propose (prop) the following fix for backporting to Tomcat 7.

http://svn.apache.org/viewvc?diff_format=h&view=revision&revision=1388890

When using the NIO connector with a WebSocket application, many (millions of) KeyReference objects are created under traffic. While each object is small, the sheer number of them seems to overwhelm the CMS collector. Before the above fix, the CMS-remark stop-the-world phase in our application was taking up to 8.8 seconds.

716.463: [GC[YG occupancy: 213377 K (392896 K)]716.463: [Rescan (parallel) , 5.7904380 secs]722.253: [weak refs processing, 2.3101000 secs] [1 CMS-remark: 1160677K(1703936K)] 1374054K(2096832K), 8.8005960 secs] [Times: user=107.17 sys=0.00, real=8.80 secs] 
Total time for which application threads were stopped: 8.8020690 seconds

After applying the fix, under the same traffic, the time drops to about 110 ms.

1282.459: [GC[YG occupancy: 203760 K (392896 K)]1282.459: [Rescan (parallel) , 0.0594610 secs]1282.519: [weak refs processing, 0.0452530 secs] [1 CMS-remark: 1159948K(1703936K)] 1363708K(2096832K), 0.1084940 secs] [Times: user=1.12 sys=0.00, real=0.11 secs] 
Total time for which application threads were stopped: 0.1100650 seconds

For a real-time application, a pause of nearly nine seconds is a killer.
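
For context, the referenced fix caches and reuses these small helper objects instead of allocating a new one per event. The sketch below is a minimal, hypothetical illustration of that kind of recycling pattern; the class and member names are mine and are not the actual Tomcat code.

    import java.util.concurrent.ConcurrentLinkedQueue;

    // Hypothetical illustration of the recycling pattern, not the Tomcat patch itself.
    public class KeyRefCache {

        // Stand-in for the small helper object allocated millions of times under load.
        static final class KeyRef {
            Object key;
            void reset() { key = null; }
        }

        private final ConcurrentLinkedQueue<KeyRef> cache =
                new ConcurrentLinkedQueue<KeyRef>();

        // Reuse a cached instance if one is available, otherwise allocate a new one.
        KeyRef take(Object key) {
            KeyRef ref = cache.poll();
            if (ref == null) {
                ref = new KeyRef();
            }
            ref.key = key;
            return ref;
        }

        // Return the instance for reuse. Note that offer() itself allocates an
        // internal queue node, a point raised in comment 2 below.
        void recycle(KeyRef ref) {
            ref.reset();
            cache.offer(ref);
        }
    }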

These tests were executed under:
RHEL 6.4
java version "1.6.0_65"
Java(TM) SE Runtime Environment (build 1.6.0_65-b14-BR8013809)
Java HotSpot(TM) 64-Bit Server VM (build 20.65-b04, mixed mode)

With the following JVM parameters:
 -server
 -XX:NewSize=384m
 -XX:MaxNewSize=384m
 -XX:+UseParNewGC
 -XX:MaxTenuringThreshold=0
 -Xms2048m
 -Xmx2048m
 -XX:+UseConcMarkSweepGC
 -XX:CMSInitiatingOccupancyFraction=68
 -XX:+UseCMSInitiatingOccupancyOnly
 -XX:PermSize=96m
 -XX:MaxPermSize=96m
 -XX:+UseMembar
 -XX:+HeapDumpOnOutOfMemoryError
 -XX:+PrintClassHistogram
Comment 1 Mark Thomas 2014-03-05 13:42:20 UTC
This has been fixed in 7.0.x and will be included in 7.0.53 onwards.
Comment 2 Konstantin Kolinko 2014-03-08 18:44:41 UTC
Regarding this change (r1574484), I would like to hear whether it really improves the situation for you.

KeyReference is just a reference object. Reusing and recycling it via a queue creates another object that holds it in the queue (a ConcurrentLinkedQueue$Node, allocated in queue.offer()). These Node objects are created on each offer and are not reused.

So the amount of garbage is exactly the same, but some processing cost is added.

My guess is that the Node objects are shorter-lived and thus are collected more easily. They also do not have a finalize() method, so they can be disposed of in one step without going through a finalizer queue. So the only difference is in the kind of garbage.
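
To make the finalize() point concrete, here is a small, self-contained experiment (my own illustration, not part of this report): an object whose class overrides finalize() must first be handled by the finalizer thread, so it typically needs at least one extra collection cycle before its memory can be reclaimed, while a plain object is reclaimed in a single pass.

    import java.lang.ref.PhantomReference;
    import java.lang.ref.ReferenceQueue;

    public class FinalizeCostDemo {

        static volatile int finalizedCount = 0;

        static class Plain {
        }

        static class WithFinalizer {
            @Override
            protected void finalize() {
                finalizedCount++;   // non-empty body so the JVM registers the finalizer
            }
        }

        // Counts GC cycles until the phantom reference is enqueued, i.e. until the
        // object has become truly reclaimable. Phantom references are enqueued only
        // after finalization has run, so a finalizable object needs extra cycles.
        static int cyclesUntilReclaimable(Object obj) throws InterruptedException {
            ReferenceQueue<Object> queue = new ReferenceQueue<Object>();
            PhantomReference<Object> ref = new PhantomReference<Object>(obj, queue);
            obj = null;   // drop the only strong reference
            int cycles = 0;
            while (!ref.isEnqueued() && cycles < 10) {
                System.gc();
                System.runFinalization();
                Thread.sleep(50);   // give the finalizer thread time to run
                cycles++;
            }
            return cycles;
        }

        public static void main(String[] args) throws InterruptedException {
            System.out.println("plain object reclaimable after "
                    + cyclesUntilReclaimable(new Plain()) + " GC cycle(s)");
            System.out.println("finalizable object reclaimable after "
                    + cyclesUntilReclaimable(new WithFinalizer()) + " GC cycle(s)");
        }
    }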

Is it worth it?
Comment 3 Jeff Middleton 2014-03-18 23:16:03 UTC
Thanks for the quick turnaround!

I see what you are saying about the reference objects, but yes, this change made a huge difference in GC times. The times quoted in the description (roughly 110 ms vs. 8.8 seconds) were with and without that patch. Maybe it is the finalize() method that is slowing things down; some quick reading suggests finalization is processed during that phase.
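
Not part of the original exchange, but one way to confirm where the remark time goes: the HotSpot flags below break the "weak refs processing" time down by reference type (including FinalReference, i.e. finalizable objects), and -XX:+ParallelRefProcEnabled lets reference processing run on multiple GC threads during the remark pause.

 -XX:+PrintGCDetails
 -XX:+PrintReferenceGC
 -XX:+ParallelRefProcEnabled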