Bug 44938

Summary: Thread http-80-SendFile-0 consumes 100% of CPU and Tomcat responses are slow
Product: Tomcat 6
Reporter: Clóvis Wichoski <clovis.wichoski>
Component: Connectors
Assignee: Tomcat Developers Mailing List <dev>
Status: RESOLVED DUPLICATE    
Severity: normal    
Priority: P2    
Version: 6.0.14   
Target Milestone: default   
Hardware: PC   
OS: Linux   
Attachments: stack when the CPU is at 100%

Description Clóvis Wichoski 2008-05-05 18:37:42 UTC
Created attachment 21924
stack when the CPU is at 100%

Hi,

Tomcat version is: 6.0.14
APR: installed via RPM apr-1.2.8-6
Apache Tomcat Native library 1.1.10
APR capabilities: IPv6 [true], sendfile [true], accept filters [false], random [true].
Linux version: 2.6.23.1-10.fc7 #1 SMP i686
JVM version: Java HotSpot(TM) Server VM (build 1.5.0_15-b04, mixed mode)
VM Options: -server -Djava.net.preferIPv4Stack=true -Xmn300m -Xms1280m -Xmx1280m


For weeks I have had a problem: Tomcat consumes 100% of the CPU for several minutes at a time, and its responses become very slow and sometimes block. I could not figure out the cause until today, when I managed to set up JConsole and found that the thread consuming all the CPU during these long periods is http-80-SendFile-0.
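
For anyone who wants to repeat this check without JConsole, here is a minimal sketch that uses the standard java.lang.management API to find the thread with the most accumulated CPU time. The class and method names (BusiestThread, findBusiestThreadId) are made up for illustration and are not part of Tomcat. The sketch only sees threads of the JVM it runs in, so it would have to be run inside Tomcat's JVM (for example from a test servlet), which is an assumption about the setup.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper for illustration only; not part of Tomcat.
class BusiestThread {

    // Returns the id of the live thread with the highest accumulated CPU time,
    // or -1 if this JVM cannot measure per-thread CPU time.
    static long findBusiestThreadId(ThreadMXBean threads) {
        if (!threads.isThreadCpuTimeSupported() || !threads.isThreadCpuTimeEnabled()) {
            return -1L;
        }
        long busiestId = -1L;
        long busiestCpuNanos = -1L;
        for (long id : threads.getAllThreadIds()) {
            // getThreadCpuTime returns -1 when the thread is no longer alive.
            long cpuNanos = threads.getThreadCpuTime(id);
            if (cpuNanos > busiestCpuNanos) {
                busiestCpuNanos = cpuNanos;
                busiestId = id;
            }
        }
        return busiestId;
    }

    public static void main(String[] args) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long id = findBusiestThreadId(threads);
        if (id < 0) {
            System.out.println("Per-thread CPU time is not available on this JVM");
            return;
        }
        ThreadInfo info = threads.getThreadInfo(id);
        if (info == null) {
            System.out.println("Thread " + id + " ended before it could be inspected");
            return;
        }
        long cpuMillis = threads.getThreadCpuTime(id) / 1000000L;
        System.out.println(info.getThreadName() + " has used " + cpuMillis + " ms of CPU");
    }
}
```

This reports CPU time accumulated since each thread started, not the current load. To catch a thread that is spinning right now, the values would need to be sampled twice and compared.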

I ran a tcpdump capture of all production traffic alongside JConsole, to get the exact time at which the problem occurs. In Wireshark I then saw many TCP checksum errors, and I found a way to avoid them, explained at:

http://wiki.wireshark.org/TCP_checksum_offload

I do not think the checksum errors are the real problem, because disabling the checksum offloading done by the Ethernet card did not solve it. I did more research and found something about the TCP Window Scale option described in RFC 1323, and something about a related problem at http://groups.apu.edu/awg/node/101. Based on that, I ran the following command at the Linux console:

sysctl -w net.ipv4.tcp_window_scaling=0

With this, it appears that APR no longer consumes all the CPU. So my question is: is there any known problem with APR and window scaling? Or do I have the wrong idea about what the real problem is?

Attached is the full stack dump taken while http-80-SendFile-0 was consuming 100% of the CPU. The stack of that thread is:

org.apache.tomcat.jni.Poll.poll(Native Method)
org.apache.tomcat.util.net.AprEndpoint$Sendfile.run(AprEndpoint.java:1748)
java.lang.Thread.run(Thread.java:595)

All of the "Error occurred during stack walking:" errors in the attached stack dump are:

java.lang.NullPointerException
        at sun.jvm.hotspot.runtime.Frame.addressOfStackSlot(Frame.java:214)
        at sun.jvm.hotspot.runtime.x86.X86Frame.getEntryFrameCallWrapper(X86Frame.java:452)
        at sun.jvm.hotspot.runtime.Frame.entryFrameIsFirst(Frame.java:379)
        at sun.jvm.hotspot.runtime.Frame.isFirstFrame(Frame.java:154)
        at sun.jvm.hotspot.runtime.VFrame.sender(VFrame.java:109)
        at sun.jvm.hotspot.runtime.VFrame.javaSender(VFrame.java:134)
        at sun.jvm.hotspot.tools.StackTrace.run(StackTrace.java:50)
        at sun.jvm.hotspot.tools.JStack.run(JStack.java:41)
        at sun.jvm.hotspot.tools.Tool.start(Tool.java:204)
        at sun.jvm.hotspot.tools.JStack.main(JStack.java:58)
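
Since the external jstack tool failed on some stacks, one possible alternative is to collect the stacks from inside the JVM with Thread.getAllStackTraces(), available since Java 5. The following is only a sketch: the class name ThreadStacks and the method formatStacks are hypothetical, and it assumes the code can be run inside Tomcat's own JVM (for example from a diagnostic JSP), because it cannot see the threads of another process.

```java
import java.util.Map;

// Hypothetical helper for illustration only; not part of Tomcat or the JDK tools.
class ThreadStacks {

    // Formats the stack of every live thread whose name contains nameFragment.
    static String formatStacks(String nameFragment) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            if (!thread.getName().contains(nameFragment)) {
                continue;
            }
            out.append(thread.getName())
               .append(" (").append(thread.getState()).append(")\n");
            for (StackTraceElement frame : entry.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "SendFile" matches thread names such as http-80-SendFile-0.
        String fragment = args.length > 0 ? args[0] : "SendFile";
        System.out.print(formatStacks(fragment));
    }
}
```

Run standalone, this prints nothing unless a thread with a matching name exists in the same JVM. Inside Tomcat, a busy sendfile thread would be expected to show frames like the Poll.poll ones quoted above.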

I really do not know whether this is a Tomcat bug or a problem with Linux, firewalls, or routers, but I wanted to share what I found and how much time it took to arrive at a workaround.

KR
Clóvis
Comment 1 Mark Thomas 2008-05-07 00:16:47 UTC
Given a lack of evidence to the contrary, I am marking this as a duplicate.

*** This bug has been marked as a duplicate of bug 42925 ***