Tomcat 6.0.20, running as a service on a 64-bit Windows 7 machine with a quad-core processor, hangs sporadically when the service is stopped. This is consistent across a variety of similar machines in our development lab. Tomcat 5.5.26 is rock solid when starting and stopping the service on the same platforms, so the problem was definitely introduced at some point in Tomcat 6. I tried a variety of JDKs, and the Java version appears to make no difference: Tomcat still hangs while trying to stop the service. Has this possibly been fixed already but not yet packaged into a new build? Thanks
Please take two or more consecutive thread dumps from a "hung" Tomcat instance. Comparing them will show which threads are stuck and where. Here is a FAQ article: http://wiki.apache.org/tomcat/HowTo#How_do_I_obtain_a_thread_dump_of_my_running_webapp_.3F
> Is this possibly fixed and has not been packaged into a new build yet.
The users@ list archives are searchable, if you are looking for other reports of the same problem. I do not remember any, though.
Coincidentally, one of our customers saw a similar issue when moving from 5.5.x to 6.0.x. I can't provide the stack traces, but I can provide the analysis. It looks like Tomcat is being stopped under load. In these circumstances, the connection created in unlockAccept() in the endpoint may get stuck in the TCP backlog queue. Since the connection in unlockAccept() is created without a timeout, this causes the shutdown to block forever. Tomcat 7 already has a configurable timeout for unlockAccept(). I will look at porting this to Tomcat 6.
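To illustrate the analysis above, here is a minimal sketch of a dummy "unlock" connection that is opened with a bounded connect timeout, so the caller gives up instead of blocking indefinitely. This is not the Tomcat code or the attached patch: the class name UnlockAcceptSketch, the method signature and the 2000 ms value are assumptions made for illustration only.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical illustration; names are not taken from Tomcat.
final class UnlockAcceptSketch {

    /**
     * Opens and immediately closes a dummy connection to wake a thread
     * that is blocked in accept(). Returns false if the connection could
     * not be established within timeoutMs (or was refused).
     */
    static boolean unlockAccept(InetSocketAddress target, int timeoutMs) {
        Socket s = new Socket();
        try {
            // Bounded connect: a timeout surfaces as SocketTimeoutException,
            // which is an IOException.
            s.connect(target, timeoutMs);
            return true;
        } catch (IOException e) {
            return false;
        } finally {
            try {
                s.close();
            } catch (IOException ignore) {
                // Nothing useful to do for a throwaway socket.
            }
        }
    }

    public static void main(String[] args) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // A listener that never calls accept(); the connection waits in the backlog.
        ServerSocket server = new ServerSocket(0, 1, loopback);
        try {
            InetSocketAddress addr =
                    new InetSocketAddress(loopback, server.getLocalPort());
            System.out.println("unlocked: " + unlockAccept(addr, 2000));
        } finally {
            server.close();
        }
    }
}
```

The sketch connects to the loopback address, in line with the idea of using localhost when the connector has no specific address configured.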
Created attachment 24827: Proposed patch. This patch addresses the potential for the connector shutdown to block when Tomcat is shut down under load. It also ensures localhost is used consistently for unlockAccept() if no specific address is provided for the connector. This should be compatible with systems that use IPv4 and/or IPv6.
The attached patch has been proposed for 6.0.x. Note that the 5.5.x code is quite different in this area, and the reports indicate that this issue affects 6.0.x but not 5.5.x.
*** Bug 47670 has been marked as a duplicate of this bug. ***
The patch in attachment 24827 looks good, though I have not tried to run it yet. +1 to adding a "Socket unlock completed for:" debug message to AprEndpoint, as is done in JIoEndpoint. Regarding s.setSoLinger(true, 0): I see that NioEndpoint of TC6 and all endpoint implementations of TC7 use s.setSoLinger(getSocketProperties().getSoLingerOn(), ...). The default value of soLingerOn is based on the Constants.DEFAULT_CONNECTION_LINGER constant (in o.a.coyote.http11 or in o.a.coyote.ajp), which is -1, so it will be false. I think that s.setSoLinger(true, 0) should be used in unlockAccept() for its dummy connection in all endpoint implementations, though I have not tested it.
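For reference, a minimal sketch of what s.setSoLinger(true, 0) on a dummy connection looks like with the plain java.net.Socket API. The class LingerSketch and its method are hypothetical and are not the endpoint code; the sketch only shows the option being set before a bounded connect. With linger enabled and a linger time of 0, closing the socket typically discards it immediately rather than going through the normal TCP close sequence.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical illustration; names are not taken from Tomcat.
final class LingerSketch {

    /**
     * Opens a throwaway connection with SO_LINGER enabled and a linger
     * time of 0, then closes it. Returns the linger value the socket
     * reported before it was closed (-1 would mean linger is disabled).
     */
    static int connectWithZeroLinger(InetSocketAddress target, int timeoutMs)
            throws IOException {
        Socket s = new Socket();
        try {
            s.setSoLinger(true, 0);       // linger on, zero seconds
            s.connect(target, timeoutMs); // bounded connect
            return s.getSoLinger();
        } finally {
            s.close();
        }
    }
}
```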
The fix has been applied to 6.0.x and will be included in 6.0.23 onwards.