Bug 48470 - Tomcat hangs while stopping
Status: RESOLVED FIXED
Product: Tomcat 6
Classification: Unclassified
Component: Connectors
Version: 6.0.20
Hardware: PC Windows Vista
Importance: P2 normal
Target Milestone: default
Assigned To: Tomcat Developers Mailing List
Duplicates: 47670
Reported: 2010-01-01 12:56 UTC by Turks
Modified: 2010-01-13 02:30 UTC



Attachments
Proposed patch (6.37 KB, patch)
2010-01-11 08:10 UTC, Mark Thomas

Description Turks 2010-01-01 12:56:28 UTC
Tomcat 6.0.20 running as a service on a 64-bit Windows 7 machine with a quad-core
processor hangs sporadically when stopping the service.

This is consistent across a variety of similar machines we have in our development
lab. Tomcat 5.5.26 is rock solid when starting and stopping the service on the same platforms, so this was definitely introduced in Tomcat 6 at some point.

I tried a variety of JDKs and it appears that the Java version makes no difference, as it still hangs while trying to stop the service.

Is this possibly fixed and has not been packaged into a new build yet?

Thanks
Comment 1 Konstantin Kolinko 2010-01-01 16:07:25 UTC
Please take two or more successive thread dumps from a "hung" Tomcat instance. Comparing them will show which threads are stuck and where.

Here is a FAQ article:

http://wiki.apache.org/tomcat/HowTo#How_do_I_obtain_a_thread_dump_of_my_running_webapp_.3F


> Is this possibly fixed and has not been packaged into a new build yet.

The users@ list archives are searchable, if you are looking for other reports of the same problem.  I do not remember any, though.
Comment 2 Mark Thomas 2010-01-11 06:41:15 UTC
Coincidentally, one of our customers saw a similar issue when moving from 5.5.x to 6.0.x.

I can't provide the stack traces but I can provide the analysis. It looks like Tomcat is being stopped under load. In these circumstances, the connection created in unlockAccept() in the endpoint may get stuck in the TCP backlog queue. Since the connection in unlockAccept() is created without a timeout, this causes the shutdown to block forever.

Tomcat 7 already has a configurable timeout for unlockAccept. I will look at porting this to Tomcat 6.
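
For illustration, the idea of bounding the dummy connection with a timeout can be sketched as below. This is a minimal sketch and not the Tomcat endpoint code or the attached patch: the class name UnlockAcceptSketch, the runDemo() helper, and the 2000 ms timeout are all hypothetical, and only standard java.net calls are used.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical illustration; not the Tomcat endpoint code.
final class UnlockAcceptSketch {

    // Opens a throwaway connection to wake a thread blocked in accept().
    // The connect is bounded by timeoutMs, and SO_TIMEOUT would bound any
    // read on the socket (this sketch does not read).
    public static boolean unlockAccept(InetSocketAddress target, int timeoutMs) {
        try (Socket s = new Socket()) {
            s.setSoTimeout(timeoutMs);
            s.connect(target, timeoutMs);
            return true;
        } catch (IOException e) {
            // Timed out or refused: give up instead of blocking shutdown.
            return false;
        }
    }

    // Starts a loopback acceptor, unlocks it, and reports whether it exited.
    public static boolean runDemo() {
        try (ServerSocket server =
                     new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread acceptor = new Thread(() -> {
                try (Socket ignored = server.accept()) {
                    // Woken by the dummy connection; nothing to process.
                } catch (IOException e) {
                    // Server socket was closed before a connection arrived.
                }
            });
            acceptor.start();
            InetSocketAddress target =
                    new InetSocketAddress(server.getInetAddress(), server.getLocalPort());
            boolean connected = unlockAccept(target, 2000);
            acceptor.join(5000);
            return connected && !acceptor.isAlive();
        } catch (IOException | InterruptedException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("acceptor released: " + runDemo());
    }
}
```

The point of the sketch is that a failed or slow connect returns false after the timeout, so the caller can carry on with shutdown rather than waiting indefinitely.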
Comment 3 Mark Thomas 2010-01-11 08:10:35 UTC
Created attachment 24827
Proposed patch

This patch addresses the potential for the connector shutdown to block when Tomcat is shut down under load.

It also ensures that localhost is used consistently for unlockAccept() if no specific address is provided for the connector. This should be compatible with systems that use IPv4 and/or IPv6.
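
As a rough illustration of the address selection described above (a sketch under assumptions, not the patch itself; the class UnlockAddressSketch and the method unlockTarget are hypothetical names), falling back to "localhost" by name lets the platform resolve it to whichever loopback address it supports:

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;

// Hypothetical illustration; not the Tomcat endpoint code.
final class UnlockAddressSketch {

    // Picks the address for the dummy unlock connection: the connector's
    // configured address if there is one, otherwise "localhost".
    public static InetSocketAddress unlockTarget(InetAddress configured, int port) {
        if (configured == null) {
            // Resolved by name, so it works on IPv4 and/or IPv6 hosts.
            return new InetSocketAddress("localhost", port);
        }
        return new InetSocketAddress(configured, port);
    }

    public static void main(String[] args) {
        System.out.println(unlockTarget(null, 8080));
        System.out.println(unlockTarget(InetAddress.getLoopbackAddress(), 8080));
    }
}
```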
Comment 4 Mark Thomas 2010-01-11 08:13:03 UTC
The attached patch has been proposed for 6.0.x.

Note that the 5.5.x code is quite different in this area, and the reports indicate that this issue affects 6.0.x but not 5.5.x.
Comment 5 Mark Thomas 2010-01-11 09:44:06 UTC
*** Bug 47670 has been marked as a duplicate of this bug. ***
Comment 6 Konstantin Kolinko 2010-01-11 17:19:28 UTC
The patch in attachment 24827 looks good, though I have not tried to run it yet.

+1 for adding a "Socket unlock completed for:" debug message to AprEndpoint, as is done in JIoEndpoint.


Regarding s.setSoLinger(true, 0):
I see that NioEndpoint of TC6 and all endpoint implementations of TC7 use
 s.setSoLinger(getSocketProperties().getSoLingerOn(), ...)

The default value of soLingerOn is derived from the Constants.DEFAULT_CONNECTION_LINGER constant (in o.a.coyote.http11 or in o.a.coyote.ajp), which is -1. Thus soLingerOn will be false by default.

I think that s.setSoLinger(true, 0) should be used in unlockAccept() for its dummy connection in all endpoint implementations. Though I have not tested it.
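
To make the linger discussion concrete, here is a minimal sketch using only the standard java.net.Socket API. The class LingerSketch and its methods are hypothetical, and the rule "a negative value means linger off" is an assumption for illustration rather than a copy of the Tomcat code. It shows that a value of -1 leaves SO_LINGER disabled, while setSoLinger(true, 0) enables it with a zero timeout so that close() drops the connection immediately.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical illustration; not the Tomcat endpoint code.
final class LingerSketch {

    // Assumed mapping: a negative linger value means "linger off",
    // zero or more enables SO_LINGER with that many seconds.
    public static void applyLinger(Socket s, int lingerSeconds) throws IOException {
        if (lingerSeconds >= 0) {
            s.setSoLinger(true, lingerSeconds);
        } else {
            s.setSoLinger(false, 0);
        }
    }

    // Connects to a loopback listener, applies the setting, and returns
    // what Socket.getSoLinger() reports (-1 when linger is disabled).
    public static int lingerAfter(int lingerSeconds) {
        try (ServerSocket server =
                     new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket s = new Socket(server.getInetAddress(), server.getLocalPort())) {
            applyLinger(s, lingerSeconds);
            return s.getSoLinger();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("linger for -1: " + lingerAfter(-1));
        System.out.println("linger for 0: " + lingerAfter(0));
    }
}
```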
Comment 7 Mark Thomas 2010-01-13 02:30:43 UTC
The fix has been applied to 6.0.x and will be included in 6.0.23 onwards.