Bug 725

Summary: Tomcat deadlocks under extreme system load
Product: Tomcat 3
Reporter: Peter <peter.mezey>
Component: Connectors
Assignee: Tomcat Developers Mailing List <dev>
Status: CLOSED LATER    
Severity: normal    
Priority: P1    
Version: 3.2.1 Final   
Target Milestone: ---   
Hardware: Sun   
OS: Solaris   

Description Peter 2001-02-26 17:35:24 UTC
Problem appears when a Verity search engine spider is launched against a 
document bundle consisting of approximately 5,000 jsp pages (each page has many 
links to the other pages in the bundle).  The use of actual jsp code is light.  
Each page only has a <jsp:include ... tag in it to get a consistent look and 
feel.  As the vspider program slams requests at Apache/Tomcat, you can watch the 
memory usage slowly rise using "top".  Once it gets to about 170MB in SIZE, the 
system goes into a 100% iowait state and Tomcat no longer responds to any 
requests.  I have slowed the vspider down to use three request threads and to 
pause 2 seconds between each request.  At that rate, it can take from 30 - 60 
minutes or longer before this happens (that works out to roughly 2,700 - 5,400 
requests to Tomcat).  Sometimes, I will see a stack trace, but not always:

java.net.SocketException: Connection reset by peer: Connection reset by peer
        at java.net.SocketInputStream.socketRead(Native Method)
        at java.net.SocketInputStream.socketRead(Compiled Code)
        at java.net.SocketInputStream.read(Compiled Code)
        at org.apache.tomcat.service.connector.TcpConnector.receiveFully(Compiled Code)
        at org.apache.tomcat.service.connector.TcpConnector.receive(Compiled Code)
        at org.apache.tomcat.service.connector.Ajp13ConnectionHandler.processConnection(Compiled Code)
        at org.apache.tomcat.service.TcpWorkerThread.runIt(Compiled Code)
        at org.apache.tomcat.util.ThreadPool$ControlRunnable.run(Compiled Code)
        at java.lang.Thread.run(Compiled Code)

The platform we have is the following:

uname -a reports:
SunOS urnge 5.6 Generic_105181-20 sun4u sparc SUNW,Ultra-2

JDK version is 1.2.2_07 (I have the same problem with 1.3.0)
Apache version is 1.3.12
Tomcat version is 3.2.1
I am using ajp13 (I have the same problem with ajp12)
The ajp13 setup in server.xml looks like this:

<Connector className="org.apache.tomcat.service.PoolTcpConnector">
    <Parameter name="handler"
               value="org.apache.tomcat.service.connector.Ajp13ConnectionHandler"/>
    <Parameter name="port" value="8009"/>
    <Parameter name="max_threads" value="200"/>
    <Parameter name="max_spare_threads" value="50"/>
    <Parameter name="min_spare_threads" value="20"/>
</Connector>

If there is any other information I can provide, do not hesitate to ask.
Comment 1 Peter 2001-02-26 17:38:07 UTC
*** Bug 724 has been marked as a duplicate of this bug. ***
Comment 2 Marc Saegesser 2001-03-20 12:49:01 UTC
This may be related to bug 1006.
Comment 3 Marc Saegesser 2001-03-21 13:00:37 UTC
This may be solved by the fix for bug 1006, which will appear in 3.2.2b2.  I'll 
leave this bug report open in case it isn't fixed, so that it can be investigated 
further in a later release.
Comment 4 Costin Manolache 2001-04-22 18:07:25 UTC
The most important question is whether the JSP pages are using sessions (the
default is true). That results in memory growing, and there's nothing to do
about it except maybe limiting the number of allowed sessions.
It would be very useful to try the same thing using "ab" from Apache
(Apache Bench) with only one of the JSPs.
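
Where "ab" is not at hand, the single-JSP test can be approximated with a small
client. The following is only a minimal sketch and not ab itself. It assumes a
modern JDK (lambdas and java.util.concurrent, neither available in the JDK
1.2/1.3 versions named in this report). The class name SingleJspLoad, the
default URL, and the request count are hypothetical; the three threads and the
2-second pause are taken from the vspider settings in the description.

```java
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for "ab": hits one URL repeatedly from a few threads.
class SingleJspLoad {

    /** Fetches the URL once, drains the response body, returns the HTTP status. */
    static int fetch(String url) throws Exception {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setConnectTimeout(5000);
        conn.setReadTimeout(30000);
        try {
            int status = conn.getResponseCode();
            InputStream in = status < 400 ? conn.getInputStream() : conn.getErrorStream();
            if (in != null) {
                try (InputStream body = in) {
                    byte[] buf = new byte[8192];
                    while (body.read(buf) != -1) {
                        // discard the body; only the load on the server matters here
                    }
                }
            }
            return status;
        } finally {
            conn.disconnect();
        }
    }

    /**
     * Runs the given request from several threads, pausing between requests.
     * Returns the number of requests that completed without an exception.
     */
    static int run(Runnable request, int threads, int requestsPerThread, long pauseMillis)
            throws InterruptedException {
        AtomicInteger completed = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < requestsPerThread; n++) {
                    try {
                        request.run();
                        completed.incrementAndGet();
                    } catch (RuntimeException e) {
                        System.err.println("request failed: " + e);
                    }
                    try {
                        Thread.sleep(pauseMillis);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        return completed.get();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical default URL; pass the real JSP address as the first argument.
        String url = args.length > 0 ? args[0] : "http://localhost/docs/page1.jsp";
        int done = run(() -> {
            try {
                fetch(url);
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }, 3, 1000, 2000L); // 3 threads, 2 s pause, as in the vspider setup
        System.out.println("completed requests: " + done);
    }
}
```

With three threads and a 2-second pause, this issues at most 1.5 requests per
second, which is the rate behind the 2,700 requests in 30 minutes quoted in the
description. Watching "top" while it runs against a single page would show
whether memory still grows when only one JSP is involved.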

 
Comment 5 Peter 2001-05-03 11:05:09 UTC
Thanks, Costin, for your tip on sessions.  I regenerated the whole document tree 
with the <%@ page session="false" %> directive in each page and re-ran the Verity 
spider program.  I am also using Tomcat 3.2.2b2 now.  It took much longer to get 
stuck, but it did get stuck once again.  At first I thought 3.2.2b2 had 
solved the problem because I was able to run through a whole document tree.  
However, repeated attempts to reproduce that success show that 3.2.2b2 does seem 
to last longer, but still eventually winds up in a full 100% iowait state, 
given a large enough document tree to index.

One other thing I've noticed is that the "SIZE" of Tomcat as reported by top 
approached the total physical memory on the box (256MB on the box, 
approx 227MB for Tomcat).  I did not see any memory exceptions reported by 
Tomcat, but is it possible that the GC keeps running to try to free 
memory, fails, and instead of throwing an exception just 
runs again?  I did not make note of the swap space left...I'll have to re-do 
this and check that as well.  One other piece of information is that the 
mod_jk.log file is full of errors like the following:

[jk_ajp13_worker.c (619)]: Error reading request
[jk_ajp13_worker.c (619)]: Error reading request
[jk_ajp13_worker.c (203)]: connection_tcp_get_message: Error - jk_tcp_socket_recvfull failed
[jk_ajp13_worker.c (619)]: Error reading request
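
One way to check the GC question raised above is to log the JVM's own view of
the heap alongside the "SIZE" column from top. The following is a minimal
sketch only, assuming a JDK newer than the ones in this report
(Runtime.maxMemory() was added in 1.4, and the lambda needs Java 8). The class
HeapLogger and its method names are hypothetical, and it would have to be
started from code running inside the Tomcat JVM.

```java
// Hypothetical helper: periodically prints heap usage as seen by the JVM.
class HeapLogger {

    /** Formats one sample of heap usage, in megabytes. */
    static String sample(Runtime rt) {
        long mb = 1024L * 1024L;
        long total = rt.totalMemory();        // heap currently reserved by the JVM
        long used = total - rt.freeMemory();  // part of that heap holding objects
        return "heap used=" + (used / mb) + "MB total=" + (total / mb)
                + "MB max=" + (rt.maxMemory() / mb) + "MB";
    }

    /** Starts a daemon thread that logs a sample to stderr at the given interval. */
    static Thread start(long intervalMillis) {
        Thread t = new Thread(() -> {
            try {
                while (true) {
                    System.err.println(System.currentTimeMillis() + " "
                            + sample(Runtime.getRuntime()));
                    Thread.sleep(intervalMillis);
                }
            } catch (InterruptedException e) {
                // interrupted: stop logging
            }
        }, "heap-logger");
        t.setDaemon(true); // do not keep the JVM alive just for logging
        t.start();
        return t;
    }
}
```

If "used" stays pinned near "max" from sample to sample, that would be
consistent with the collector running repeatedly without reclaiming much. If
the heap looks healthy while SIZE in top keeps growing, the growth is more
likely outside the Java heap. The standard -verbose:gc JVM option gives similar
information without any code changes.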
Comment 6 Costin Manolache 2002-10-02 18:45:40 UTC
I'll close this bug. It should have been fixed in jk1.2 and jk2. Please verify,
and if it's still a problem, reopen the bug.