Bug 44454

Summary: busy count reported in mod_jk inflated, causes incorrect balancing
Product: Tomcat Connectors
Component: Common
Reporter: Andrew Noonan <anoonan>
Assignee: Tomcat Developers Mailing List <dev>
Status: RESOLVED FIXED
Severity: normal
CC: hd3sistemas, toutatisdev, wrowe
Priority: P2
Version: unspecified
Target Milestone: ---
Hardware: PC
OS: Linux
Attachments: Results from Thread Dump - Idle Server
Thread Dump from Apache Children
Mod_jk scheduling errors

Description Andrew Noonan 2008-02-19 13:03:33 UTC
We have some servers running Apache 2.2.3-11 on Centos 5 with mod_jk 1.2.25
connecting to tomcats running 5.5.20.  Over the course of a couple of days, the
busy count on the various servers (as reported by the jkstatus page) creeps
upward, causing the balancing to get out of whack.  A graceful restart of apache
gets things back to normal, as does a restart of the tomcat, but the process
starts again.  I'm not entirely sure what information would be helpful in
troubleshooting this, please advise.  Earlier load testing did not show
anything, so I'm guessing that specific circumstances may cause the count to
become wrong.

Thanks,
Andrew
Comment 1 Rainer Jung 2008-02-19 13:31:23 UTC
Are we sure that the growing busy numbers are wrong?

Asked differently: can you do thread dumps of your backends to confirm that they
are *not* really handling that many requests? In theory some requests might have
gotten stuck in the backend. If you don't have a reply_timeout, mod_jk will wait
indefinitely for a response.

By using "kill -QUIT PID" (PID is the process id of your backend Tomcat process)
you will get a stack for each thread in Tomcat in your catalina.out. This
procedure does not influence your running Tomcat negatively. It will pause for a
few milliseconds and then resume its usual work.
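For example, a hypothetical sketch of this procedure (the PID discovery pattern and the catalina.out location are assumptions; adjust them to your installation):

```shell
# Ask the Tomcat JVM for a thread dump. SIGQUIT makes the JVM append
# the dump to catalina.out; it does not print to the terminal.
PID=$(pgrep -f org.apache.catalina.startup.Bootstrap | head -n 1)
kill -QUIT "$PID"

# The AJP connector threads appear as TP-ProcessorXXX in the dump:
grep -A 15 'TP-Processor' /path/to/catalina.out
```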

In the dump, the threads responsible for requests coming in via AJP13 (from
mod_jk) can be identified as TP-ProcessorXXX. Some of them will have only a very
short stack, which indicates that they are idle in the thread pool. Some have a
slightly longer stack (about 10 lines) with the top-level method being a socket
read, which tells us that mod_jk is connected to them but they are not actually
processing a request. A few will have a very long stack, usually containing
webapp methods, a service method, etc. Those are the ones processing a request,
and their count should be equivalent to "busy" (at least if Tomcat only gets
requests from one httpd; otherwise you need to add up the busy values for this
Tomcat across the various httpds).

Although in principle there's a chance that busy could go wrong, I never
detected it in practice, and the feature has been out in the wild for quite some
time now without problem reports. That's why I first want you to check that the
numbers are really wrong.

Furthermore: are there errors in the JK error log?

Last: if you want to track how and when busy increases, you can add
"%{JK_LB_LAST_BUSY}n" to your LogFormat in httpd. Then your access log will
contain the busy number after each request in the request log line.
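A hypothetical LogFormat illustrating this; apart from the %{JK_LB_LAST_BUSY}n note, the other fields and the "jkbusy" nickname are just examples modeled on the stock "combined" format:

```apache
LogFormat "%h %l %u %t \"%r\" %>s %b busy=%{JK_LB_LAST_BUSY}n" jkbusy
CustomLog logs/access_log jkbusy
```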

Regards,

Rainer
Comment 2 Andrew Noonan 2008-02-19 14:00:02 UTC
Created attachment 21564 [details]
Results from Thread Dump - Idle Server

The JK Threads from an idle tomcat instance during a thread dump
Comment 3 Andrew Noonan 2008-02-19 14:08:47 UTC
I took one of our systems out of rotation and let it sit for a while.  Its busy
was hanging around 16-20 while in rotation.  Once out, it almost immediately
went to 12 and remained at 12 the entire time.  I did the thread dump and
attached the TP-ProcessorXXX entries.  It looks like the vast majority of
connections show in socket read, with a few idle in the pool.  There were no
active threads (which is good, since it was out of rotation).  The mod_jk.log
shows no errors, though there are many warnings for things like: 

[Tue Feb 19 15:56:18.849 2008] [13119:2611919760] [warn]
map_uri_to_worker::jk_uri_worker_map.c (550): Uri echo|20YYY;echo| is invalid.
Uri must start with /

Good ole' internet.  Thanks for the variable, I can see that being handy, though
I can't add anything in right now.

Let me know if there's anything more.
Comment 4 Rainer Jung 2008-02-19 14:38:02 UTC
Thanks for the check; now we know for sure that the busy count doesn't
correspond to the backend status.

Although there is still a theoretical possibility that mod_jk *thinks* the
backend is that busy, I checked whether we miss some error conditions when
decreasing the busy count. I found no condition in the code under which the busy
increment would not be followed by a decrement. The counter is held inside a
shared memory segment and is marked as volatile. In case you have a *very* busy
web site, it's possible that we run into a non-atomic increment/decrement
problem. Although that's possible, I'm still thinking about other possibilities.

The status worker shows a busy counter for the lb and for each member of an lb.
Do both types of busy counters suffer from the problem?

To get an idea: how fast does the problem happen (how much busy increase per day
or so, and also how many requests approximately are handled by the corresponding
worker during that time)?

You mentioned that this negatively influences the balancing behaviour. I deduce
from that that you are using the Busyness method. Do you have a reason not to
use the Requests method? It would not suffer from the "wrong busy" problem.

Finally: although we know of no change in the code that would suggest the
problem will go away in 1.2.26, is it possible to switch to that version? Be
careful: in 1.2.26 your JkMounts are not automatically inherited between httpd
VirtualHosts. You either need to include them in each VirtualHost or use
JkMountCopy (see docs). Apart from that, the version update should be reliable.

I'm still wondering if mod_jk threads might hang between the start of a request
and the end of its response. You could use gstack to do thread dumps on the
httpd side. Since the busy number is summed up over all httpd child processes
and we don't know whether the problem is located in only one of those or visible
in all of them, you would need to run gstack on all httpd child processes.

The ones busy in mod_jk will be easily detectable, since mod_jk uses "jk" as
part of most function names.
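The procedure above could be scripted roughly like this (a sketch; the "httpd" process name and the output path are assumptions):

```shell
# Dump native stacks of every httpd child process with gstack.
for pid in $(pgrep httpd); do
    echo "=== PID $pid ==="
    gstack "$pid"
done > /tmp/httpd-stacks.txt

# Threads busy inside mod_jk show jk_/ajp_ frames, e.g. ajp_service:
grep -c 'ajp_service' /tmp/httpd-stacks.txt
```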
Comment 5 Andrew Noonan 2008-02-19 15:50:35 UTC
     Well, we're a busy site, but not stunningly so... everyone's idea of what
makes a busy site may differ.  The httpd systems are never terribly loaded
CPU/load-wise (less than 40%).  The global counter and the individual counters
add up to each other, so I believe that indicates that they both have the
issue.  It's hard to tell at what level the inflation occurs; without calming
servers at around the same time each day, it's hard to tell which are real
requests and which are inflation.  We have a few servers in rotation that are
bulkier and run a 2x higher lbfactor, and they currently seem to have about 2x
the false busys.  I'm guessing about 5 a day for the lighter servers, and 10 a
day for the bigger servers.  Because the load balancing is unequal, it varies,
but since our graceful restart 2.5 days ago, the ACC on the servers is anywhere
from 2-4 million, depending on the server.
     As for the balancing method, the busyness method is definitely more even
when it works since our requests can be very different weights.  As for running
1.2.26, I think we could compile/use it no problem (we're running 2.2.3, the
precompile indicates 2.2.6 as a prereq), but I know management would be a little
grumbly to make a change without a better idea as to what's going on.  I don't
think the config changes are a problem for us as we use preference files and
JKMounts only at VHost level... no globals.
     I'm gathering the gstack data now and will post in a few minutes.

In case it comes up, a typical worker is configured like:
worker.<WORKER>.port=8009
worker.<WORKER>.host=<IP>
worker.<WORKER>.type=ajp13
worker.<WORKER>.socket_timeout=30
worker.<WORKER>.connection_pool_timeout=60
worker.<WORKER>.lbfactor=20

The LB involved is:
worker.<LB>.lock=P
worker.<LB>.method=B

and we have a global setting of:
worker.maintain=15

Thanks,
Andrew
Comment 6 Andrew Noonan 2008-02-19 15:54:03 UTC
Created attachment 21566 [details]
Thread Dump from Apache Children
Comment 7 Rainer Jung 2008-02-20 06:18:25 UTC
How many threads per process do you use in httpd (MPM configuration)? 60?

Anyway, I adapted my JVM stack analysis script to gstack. The results (sums
over all 19 processes = 1178 threads) follow below. In total I can see 68
threads that would count as busy in the lb (and 2 threads that seem to do direct
AJP work without the lb). So how do these numbers compare to your observed
busyness at the time of the dumps?

Concerning your configuration:

Pessimistic locking for me is more a relic from the early days of the code. I
don't know of any cases where it is actually needed. If you had opened this case
with optimistic locking (the default), I would have asked you to try pessimistic
locking when running out of ideas, but your case proves once more that it's not
really useful. I would stick to the default (optimistic) in order to run the
code path that's used most often.

You don't have a reply_timeout. The stacks seem to indicate that hanging replies
might well be your problem. Unfortunately, in 1.2.25 max_reply_timeouts doesn't
work, so any reply_timeout that fires will put your worker into error state
until recovery, and other users will lose sessions when being switched over to
failover workers.

In 1.2.26 we fixed max_reply_timeouts, so you can configure a reply_timeout
(don't wait longer than X for an answer) together with the number of reply
timeouts you will tolerate before concluding that the backend itself is in
serious trouble and should be put out of service. So maybe here's the argument
for a switch to 1.2.26, although I hadn't expected that situation last night
when I wrote about 1.2.26.

In 1.2.27 (March) it will be possible to configure a reply_timeout not only per
worker but also per mapping, so that you can give individual JkMounts
individual reply_timeouts. If some use cases are known to take a long time, you
can have an appropriately short general timeout, with longer timeouts for the
URLs known to take notoriously long. It already works in 1.2.27-dev, but
there's no release out yet.

Have a look at the following data yourself. I think the most reasonable approach
is to use a reply_timeout with max_reply_timeouts, which in turn will force you
to update to 1.2.26.
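As a sketch, the 1.2.26 settings described above might look like this in workers.properties (the values are only examples; reply_timeout is in milliseconds):

```properties
worker.<WORKER>.reply_timeout=60000
worker.<WORKER>.max_reply_timeouts=10
```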

The details of the dumps:

548 Threads idle waiting (I would say in Keepalive,
    so hanging on a connection from the client,
    but waiting for the next request):
#0  0xHEXADDR in __kernel_vsyscall ()
#1  0xHEXADDR in poll () from /lib/libc.so.6
#2  0xHEXADDR in apr_wait_for_io_or_timeout () from /usr/lib/libapr-1.so.0
#3  0xHEXADDR in apr_socket_recv () from /usr/lib/libapr-1.so.0
#4  0xHEXADDR in apr_bucket_socket_create () from /usr/lib/libaprutil-1.so.0
#5  0xHEXADDR in apr_brigade_split_line () from /usr/lib/libaprutil-1.so.0
#6  0xHEXADDR in ap_core_input_filter () from /proc/PID/exe
#7  0xHEXADDR in ap_get_brigade () from /proc/PID/exe
#8  0xHEXADDR in ap_rgetline_core () from /proc/PID/exe
#9  0xHEXADDR in ap_read_request () from /proc/PID/exe
#10 0xHEXADDR in ap_register_input_filter () from /proc/PID/exe
#11 0xHEXADDR in ap_run_process_connection () from /proc/PID/exe
#12 0xHEXADDR in ap_process_connection () from /proc/PID/exe
#13 0xHEXADDR in ap_graceful_stop_signalled () from /proc/PID/exe
#14 0xHEXADDR in apr_wait_for_io_or_timeout () from /usr/lib/libapr-1.so.0
#15 0xHEXADDR in start_thread () from /lib/libpthread.so.0
#16 0xHEXADDR in clone () from /lib/libc.so.6

272 Threads idle
#0  0xHEXADDR in __kernel_vsyscall ()
#1  0xHEXADDR in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#2  0xHEXADDR in apr_thread_cond_wait () from /usr/lib/libapr-1.so.0
#3  0xHEXADDR in ap_queue_pop () from /proc/PID/exe
#4  0xHEXADDR in ap_graceful_stop_signalled () from /proc/PID/exe
#5  0xHEXADDR in apr_wait_for_io_or_timeout () from /usr/lib/libapr-1.so.0
#6  0xHEXADDR in start_thread () from /lib/libpthread.so.0
#7  0xHEXADDR in clone () from /lib/libc.so.6

235 Threads lingering close (client connection shutdown)
#0  0xHEXADDR in __kernel_vsyscall ()
#1  0xHEXADDR in poll () from /lib/libc.so.6
#2  0xHEXADDR in apr_wait_for_io_or_timeout () from /usr/lib/libapr-1.so.0
#3  0xHEXADDR in apr_socket_recv () from /usr/lib/libapr-1.so.0
#4  0xHEXADDR in ap_lingering_close () from /proc/PID/exe
#5  0xHEXADDR in ap_graceful_stop_signalled () from /proc/PID/exe
#6  0xHEXADDR in apr_wait_for_io_or_timeout () from /usr/lib/libapr-1.so.0
#7  0xHEXADDR in start_thread () from /lib/libpthread.so.0
#8  0xHEXADDR in clone () from /lib/libc.so.6

59 Threads in mod_jk, waiting for reply from the backend (with lb?)
#0  0xHEXADDR in __kernel_vsyscall ()
#1  0xHEXADDR in read () from /lib/libpthread.so.0
#2  0xHEXADDR in jk_tcp_socket_recvfull () from /etc/httpd/modules/mod_jk.so
#3  0xHEXADDR in ajp_connection_tcp_get_message ()
#4  0xHEXADDR in ajp_get_reply () from /etc/httpd/modules/mod_jk.so
#5  0xHEXADDR in ajp_service () from /etc/httpd/modules/mod_jk.so
#6  0xHEXADDR in service () from /etc/httpd/modules/mod_jk.so
#7  0xHEXADDR in jk_handler () from /etc/httpd/modules/mod_jk.so
#8  0xHEXADDR in ap_run_handler () from /proc/PID/exe
#9  0xHEXADDR in ap_invoke_handler () from /proc/PID/exe
#10 0xHEXADDR in ap_process_request () from /proc/PID/exe
#11 0xHEXADDR in ap_register_input_filter () from /proc/PID/exe
#12 0xHEXADDR in ap_run_process_connection () from /proc/PID/exe
#13 0xHEXADDR in ap_process_connection () from /proc/PID/exe
#14 0xHEXADDR in ap_graceful_stop_signalled () from /proc/PID/exe
#15 0xHEXADDR in apr_wait_for_io_or_timeout () from /usr/lib/libapr-1.so.0
#16 0xHEXADDR in start_thread () from /lib/libpthread.so.0
#17 0xHEXADDR in clone () from /lib/libc.so.6

36 Threads internal tasks (non JK, non-worker threads)

11 Threads sending non-JK content back to browser
#0  0xHEXADDR in __kernel_vsyscall ()
#1  0xHEXADDR in poll () from /lib/libc.so.6
#2  0xHEXADDR in apr_wait_for_io_or_timeout () from /usr/lib/libapr-1.so.0
#3  0xHEXADDR in apr_socket_sendv () from /usr/lib/libapr-1.so.0
#4  0xHEXADDR in ap_bucket_eoc_create () from /proc/PID/exe
#5  0xHEXADDR in ap_core_output_filter () from /proc/PID/exe
#6  0xHEXADDR in ap_pass_brigade () from /proc/PID/exe
#7  0xHEXADDR in ap_http_chunk_filter () from /proc/PID/exe
#8  0xHEXADDR in ap_pass_brigade () from /proc/PID/exe
#9  0xHEXADDR in ap_http_outerror_filter () from /proc/PID/exe
#10 0xHEXADDR in ap_pass_brigade () from /proc/PID/exe
#11 0xHEXADDR in ap_content_length_filter () from /proc/PID/exe
#12 0xHEXADDR in ap_pass_brigade () from /proc/PID/exe
#13 0xHEXADDR in ?? () from /etc/httpd/modules/mod_deflate.so
#14 0xHEXADDR in ap_pass_brigade () from /proc/PID/exe
#15 0xHEXADDR in ap_old_write_filter () from /proc/PID/exe
#16 0xHEXADDR in ap_pass_brigade () from /proc/PID/exe
#17 0xHEXADDR in ap_note_auth_failure () from /proc/PID/exe
#18 0xHEXADDR in ap_process_request () from /proc/PID/exe
#19 0xHEXADDR in ap_register_input_filter () from /proc/PID/exe

6 Threads sending back jk-replies to browser
#0  0xHEXADDR in __kernel_vsyscall ()
#1  0xHEXADDR in poll () from /lib/libc.so.6
#2  0xHEXADDR in apr_wait_for_io_or_timeout () from /usr/lib/libapr-1.so.0
#3  0xHEXADDR in apr_socket_sendv () from /usr/lib/libapr-1.so.0
#4  0xHEXADDR in ap_bucket_eoc_create () from /proc/PID/exe
#5  0xHEXADDR in ap_core_output_filter () from /proc/PID/exe
#6  0xHEXADDR in ap_pass_brigade () from /proc/PID/exe
#7  0xHEXADDR in ap_http_chunk_filter () from /proc/PID/exe
#8  0xHEXADDR in ap_pass_brigade () from /proc/PID/exe
#9  0xHEXADDR in ap_http_outerror_filter () from /proc/PID/exe
#10 0xHEXADDR in ap_pass_brigade () from /proc/PID/exe
#11 0xHEXADDR in ap_content_length_filter () from /proc/PID/exe
#12 0xHEXADDR in ap_pass_brigade () from /proc/PID/exe
#13 0xHEXADDR in ap_filter_flush () from /proc/PID/exe
#14 0xHEXADDR in apr_brigade_write () from /usr/lib/libaprutil-1.so.0
#15 0xHEXADDR in ap_old_write_filter () from /proc/PID/exe
#16 0xHEXADDR in ap_rwrite () from /proc/PID/exe
#17 0xHEXADDR in ws_write () from /etc/httpd/modules/mod_jk.so
#18 0xHEXADDR in ajp_get_reply () from /etc/httpd/modules/mod_jk.so
#19 0xHEXADDR in ajp_service () from /etc/httpd/modules/mod_jk.so

2 Threads waiting in mod_jk for response from backend (without lb?)
#0  0xHEXADDR in __kernel_vsyscall ()
#1  0xHEXADDR in read () from /lib/libpthread.so.0
#2  0xHEXADDR in jk_tcp_socket_recvfull () from /etc/httpd/modules/mod_jk.so
#3  0xHEXADDR in ajp_connection_tcp_get_message ()
#4  0xHEXADDR in ajp_get_reply () from /etc/httpd/modules/mod_jk.so
#5  0xHEXADDR in ajp_service () from /etc/httpd/modules/mod_jk.so
#6  0xHEXADDR in jk_handler () from /etc/httpd/modules/mod_jk.so
#7  0xHEXADDR in ap_run_handler () from /proc/PID/exe
#8  0xHEXADDR in ap_invoke_handler () from /proc/PID/exe
#9  0xHEXADDR in ap_process_request () from /proc/PID/exe
#10 0xHEXADDR in ap_register_input_filter () from /proc/PID/exe
#11 0xHEXADDR in ap_run_process_connection () from /proc/PID/exe
#12 0xHEXADDR in ap_process_connection () from /proc/PID/exe
#13 0xHEXADDR in ap_graceful_stop_signalled () from /proc/PID/exe
#14 0xHEXADDR in apr_wait_for_io_or_timeout () from /usr/lib/libapr-1.so.0
#15 0xHEXADDR in start_thread () from /lib/libpthread.so.0
#16 0xHEXADDR in clone () from /lib/libc.so.6

2 Threads writing back jk-replies to client (with chunking)
#0  0xHEXADDR in __kernel_vsyscall ()
#1  0xHEXADDR in poll () from /lib/libc.so.6
#2  0xHEXADDR in apr_wait_for_io_or_timeout () from /usr/lib/libapr-1.so.0
#3  0xHEXADDR in apr_socket_sendv () from /usr/lib/libapr-1.so.0
#4  0xHEXADDR in ap_bucket_eoc_create () from /proc/PID/exe
#5  0xHEXADDR in ap_core_output_filter () from /proc/PID/exe
#6  0xHEXADDR in ap_pass_brigade () from /proc/PID/exe
#7  0xHEXADDR in ap_http_outerror_filter () from /proc/PID/exe
#8  0xHEXADDR in ap_pass_brigade () from /proc/PID/exe
#9  0xHEXADDR in ap_content_length_filter () from /proc/PID/exe
#10 0xHEXADDR in ap_pass_brigade () from /proc/PID/exe
#11 0xHEXADDR in ap_filter_flush () from /proc/PID/exe
#12 0xHEXADDR in apr_brigade_write () from /usr/lib/libaprutil-1.so.0
#13 0xHEXADDR in ap_old_write_filter () from /proc/PID/exe
#14 0xHEXADDR in ap_rwrite () from /proc/PID/exe
#15 0xHEXADDR in ws_write () from /etc/httpd/modules/mod_jk.so
#16 0xHEXADDR in ajp_get_reply () from /etc/httpd/modules/mod_jk.so
#17 0xHEXADDR in ajp_service () from /etc/httpd/modules/mod_jk.so
#18 0xHEXADDR in service () from /etc/httpd/modules/mod_jk.so
#19 0xHEXADDR in jk_handler () from /etc/httpd/modules/mod_jk.so

1 Thread flushing to client in jk reply
#0  0xHEXADDR in __kernel_vsyscall ()
#1  0xHEXADDR in poll () from /lib/libc.so.6
#2  0xHEXADDR in apr_wait_for_io_or_timeout () from /usr/lib/libapr-1.so.0
#3  0xHEXADDR in apr_socket_sendv () from /usr/lib/libapr-1.so.0
#4  0xHEXADDR in ap_bucket_eoc_create () from /proc/PID/exe
#5  0xHEXADDR in ap_core_output_filter () from /proc/PID/exe
#6  0xHEXADDR in ap_pass_brigade () from /proc/PID/exe
#7  0xHEXADDR in ap_http_outerror_filter () from /proc/PID/exe
#8  0xHEXADDR in ap_pass_brigade () from /proc/PID/exe
#9  0xHEXADDR in ap_content_length_filter () from /proc/PID/exe
#10 0xHEXADDR in ap_pass_brigade () from /proc/PID/exe
#11 0xHEXADDR in ap_old_write_filter () from /proc/PID/exe
#12 0xHEXADDR in ap_pass_brigade () from /proc/PID/exe
#13 0xHEXADDR in ap_rflush () from /proc/PID/exe
#14 0xHEXADDR in ws_flush () from /etc/httpd/modules/mod_jk.so
#15 0xHEXADDR in ajp_get_reply () from /etc/httpd/modules/mod_jk.so
#16 0xHEXADDR in ajp_service () from /etc/httpd/modules/mod_jk.so
#17 0xHEXADDR in service () from /etc/httpd/modules/mod_jk.so
#18 0xHEXADDR in jk_handler () from /etc/httpd/modules/mod_jk.so
#19 0xHEXADDR in ap_run_handler () from /proc/PID/exe

5 Threads other states, non-jk related
Comment 8 Andrew Noonan 2008-02-20 09:10:53 UTC
     The busyness reported at the time of the dump was around 300, though given
the amount of inflation seen in previous testing, 68 sounds reasonable for the
actual connections at the time.  We are running 60 threads per child.  I think
it's reasonable for us to run optimistic locking, and I'll see if we can change
to that this evening.  As for 1.2.26, we're doing some compiling and testing on
that right now... any thoughts on the use of --enable-flock at compile time?
Going to 1.2.26 will take a few days, though, as we will need to move slowly
and test things out.
Comment 9 Rainer Jung 2008-02-20 09:46:45 UTC
So unfortunately we now *know* that the number is wrong, i.e. it doesn't
reflect the internal state of mod_jk. I had hoped that 68 was already the
inflated value.

That means: all my comments on reply_timeout are true, but will most likely
*not* help about the wrong busy values.

So I'm a little clueless on how to proceed further ...

With the busy count in the access log, we could see whether the unwanted
increase in busy mostly happens during busy hours (assuming that your load is
not evenly distributed over the day). If we really have a problem with
non-atomic increment/decrement, we should expect it to happen mostly during the
peak hours.

Possible workarounds:

- Don't use lbmethod Busyness (yes, I know you've got good reasons to use it,
but unfortunately also one reason not to).
- (Gracefully) restart the web server. If you can live with the problem for a
day, you could restart the web server once every night. Caution: this will reset
all settings changed in mod_jk via the status worker during the day back to the
values configured in the config files, e.g. activation status (disable, stop).
- I could find out whether one can easily change the busy value from the outside
in the shm file. Of course we don't know the correct busy value, so we could
only reset it to 0. If mod_jk decrements the value and it would go below 0, we
keep it at zero. So there is some chance that after resetting it to zero, it
will get close to the correct number after some time (we would only need a short
moment where load is statistically low).

No really nice option available at the moment ...

How many threads per process do you have configured for the MPM? 60?

No advice on --enable-flock.

Do you think you could reproduce this with a simple test scenario? That would
open up the opportunity to test patches. As I said, we can't reproduce it, and
I don't know of any other such case.
Comment 10 Andrew Noonan 2008-02-20 10:28:28 UTC
Yeah, I did a bunch of load testing against these systems when we moved to 2.2
and 1.2.25 (we were using 2.0 and an internally hacked version of 1.2.<something
old, perhaps 12>) where I tossed a ton of requests at them to the tune of 5000
req/sec through JK with a couple of servers on the backend that soaked up the
dummy requests with a random delay.  These tests would run for over a day and
the result would have the busy count going back down to 0.  I suspect that this
is event driven, and that the load relationship exists simply because higher
loaded systems are more likely to experience the event.  Is there a case that
you can think of where the busy counter is incremented, but the thread is
aborted?  Is there JK logging that could be turned on for a while that would log
the increment and decrement of the counter with requests?  This way they could
be paired up and the specific requests not being decremented would be identified
and could be examined for issues.  It seems like the access log results would
not give us this info, because they show the overall state at the completion of
the request, not whether the thread actually incremented and decremented the
counters.  I don't really know what I'm talking about here, so let me know if
something like that is possible or not very practical.

Thanks,
Andrew
Comment 11 Rainer Jung 2008-02-20 15:47:46 UTC
Before I start to think deeper about your suggestion: could it be related 
to "apachectl restart" or "apachectl graceful"? Or would you say that that 
doesn't fit what you see?

Second possibility: are we sure that you don't experience any process crashes 
(segfaults et al.)? Normally those should get logged in the usual httpd error 
log.
Comment 12 Andrew Noonan 2008-02-20 16:01:33 UTC
Well, I'm not 100% sure what you mean by related to apache graceful/restart.
 When we do a graceful (we don't really do restarts because of the live nature),
the counts do seem to reset, so I don't *think* there's a problem there.  I
don't see any segfaults in the log, just junk like "request failed: error
reading the headers" from crappy bots and script kiddies.  Also, I just started
looking through the code, and it seems like the difference between optimistic
and pessimistic locking is that pessimistic locking just does the same
locking/unlocking sequence in more places; is that right?  I'm not a C
programmer by any stretch of the imagination.
Comment 13 Rainer Jung 2008-02-20 16:36:24 UTC
optimistic/pessimistic: that's right. As I said, switching to optimistic will
most likely not help with this issue, but I would expect less risk of other issues.

graceful: I was thinking about graceful restarts, which let threads finish
their request, but I thought they might not be able to decrease the busy
counter in the shared memory any more. As you said, that does not seem to be
the reason for your problem.

Another comment: the busy count is not really what's used by the Busyness
method. We actually use what's called "V" in the status output. It should be the
same as busy in case all load factors are 1, and otherwise a bigger number
indicating weighted busyness. Since you observe problems in the balancing, I
expect the "V" values to get inflated as well. Right?

The factor by which V differs from busy is shown as multiplicity "M" in the
status output. Those are integers which have the same ratios between the
different workers as 1/(load factor).
Comment 14 Andrew Noonan 2008-02-21 08:45:23 UTC
Yeah, the relationship between M, V, and the Busy is correct (which is to say
that V is inflated).
Comment 15 Paul Martin 2008-04-14 06:25:52 UTC
We've noticed this problem in our production environment as well...  At present it is a low-load point in the day (when we are processing ~20 requests), yet the busy count is reported as being 400+ (which is about MAX - 20 in jk-status).  

Nothing short of restarts seem to correct the count.

I've literally tried all the timeout values I could find (on both ends - mod_jk and jboss) - keeping of course in mind their intended purposes as described in the docs - but none seemed to have any effect on the busy count not decrementing.

I did a jboss thread dump and found 146 threads with the word "8009" in it, but according to Rainer's analysis only a handful (11 to be exact - I grepped RUNNABLE threads only) of them are actually processing requests.

Curiously enough, we did not seem to have this issue in our test environment (with nearly identical configurations), even after intensive load testing.

I'm not sure if this is something specific to our environment.  For the record:

SunOS mega 5.10 Generic_127111-06 sun4v sparc SUNW,Sun-Fire-T200

Server Version: Apache/2.2.6 (Unix) mod_jk/1.2.25
JK Version:     mod_jk/1.2.25

Server version: Apache/2.2.6 (Unix)
Server built:   Jan 19 2008 22:12:59
Server's Module Magic Number: 20051115:5
Server loaded:  APR 1.2.11, APR-Util 1.2.10
Compiled using: APR 1.2.11, APR-Util 1.2.10
Architecture:   32-bit
Server MPM:     Worker
  threaded:     yes (fixed thread count)
    forked:     yes (variable process count)
Server compiled with....
 -D APACHE_MPM_DIR="server/mpm/worker"
 -D APR_HAS_SENDFILE
 -D APR_HAS_MMAP
 -D APR_HAVE_IPV6 (IPv4-mapped addresses enabled)
 -D APR_USE_FCNTL_SERIALIZE
 -D APR_USE_PTHREAD_SERIALIZE
 -D SINGLE_LISTEN_UNSERIALIZED_ACCEPT
 -D APR_HAS_OTHER_CHILD
 -D AP_HAVE_RELIABLE_PIPED_LOGS
 -D DYNAMIC_MODULE_LIMIT=128
 -D HTTPD_ROOT="/usr/local/httpd-cls01"
 -D SUEXEC_BIN="/usr/local/httpd-cls01/bin/suexec"
 -D DEFAULT_SCOREBOARD="logs/apache_runtime_status"
 -D DEFAULT_ERRORLOG="logs/error_log"
 -D AP_TYPES_CONFIG_FILE="conf/mime.types"
 -D SERVER_CONFIG_FILE="conf/httpd.conf"


We are currently running in degraded mode due to a session replication issue, so for the time being only one node is responding to requests.  Perhaps this is part of the equation?
Comment 16 Paul Martin 2008-04-14 06:29:06 UTC
I forgot to mention the rate of the leak... we last did a restart about 48 hours ago.  We seem to be incrementing the busy count by about 200 per day.

(In reply to comment #15)
Comment 17 Mark Greene 2008-09-30 08:19:21 UTC
Hi,

I wanted to know if there is any intent to fix this issue, as I have experienced EXACTLY the same symptoms in my production environment and have been unable to reproduce them in a test environment by simply throwing a load tester at it.

I'm using:
Server Version:	Apache/2.0.52 (CentOS)
JK Version:	mod_jk/1.2.26
JBoss Version: 4.2.2 GA

I have also set about every timeout I could find in the workers.properties documentation as well as the timeout on the AJP connector inside tomcat. 

I was incredibly happy to see this bug filed as every time I see a forum post with the same issue, someone is quick to respond with telling the poster to set all these timeout settings.

Thanks.
Comment 18 amit.x.yadav 2009-06-24 21:12:53 UTC
I too have been having this exact same issue in my PROD environment. I am using version 1.2.27 of mod_jk and running IHS on Solaris. The symptoms are identical to the ones described in this bug. We have spent a good part of the last 3 weeks trying to address the issue of the inflated busy count.

Are there any plans to resolve this issue once and for all?

Thank you
Comment 19 William A. Rowe Jr. 2010-06-21 22:44:52 UTC
Studying this at length; the locking mechanism is not the issue.  fcntl()
is likely the most reliable (not the fastest), and flock(), which is not used by
default, would be far less reliable as it isn't supported by NAS solutions.

The faulty assumption that 'volatile' magically enables atomic math operations
in SMP environments is the issue.  'int64' is not atomic on any platform, and
it would appear that 'int' increments are acting appropriately for affected users,
while their decrements are failing.

All operations on the scoreboard must be mutexed.
Comment 20 Rainer Jung 2010-06-22 03:26:19 UTC
Thanks for looking into this Bill!

Should 32-bit operations be atomic? The busy counter could well be 32 bits.

For most of the other pure statistics counters, like bytes transferred, total request count, etc., I'm not too concerned about the correctness of the numbers, but the busy count influences the work of the balancer.

Regards,

Rainer
Comment 21 William A. Rowe Jr. 2010-06-22 03:44:22 UTC
Well, we are already misusing the phrase optlock vs real locking, so perhaps an
additional pedantic mode that always locks?

32-bit math is not SMP-safe on most architectures.  There is an apr_atomic
interface that illustrates this, if you would like to refer to the APR project
sources.
Comment 22 Sebb 2010-06-22 07:26:05 UTC
At the risk of stating the obvious:

32-bit reads and writes are guaranteed atomic by the JVM.

However, an arithmetic operation (e.g. increment) requires two operations, a read and then a write, and another thread can write the field between the two atomic operations.

64-bit reads and writes are not even guaranteed atomic. This was to allow for systems that do not have 64-bit operations and therefore have to perform two 32-bit operations. (Note: they probably are atomic on most modern systems, depending on the alignment.)

Adding volatile makes 64-bit reads and writes atomic, but does not protect multiple operations such as increment.
Comment 23 Mark Thomas 2010-06-22 07:29:39 UTC
(In reply to comment #22)
> At the risk of stating the obvious:

Interesting but irrelevant. This is httpd module code, not Java code.
Comment 24 Sebb 2010-06-22 08:00:29 UTC
Oops, sorry! Though a JVM is not involved here, increment still requires two separate memory operations.

BTW, shared memory can behave very unexpectedly - e.g. writes by one CPU may appear out of order to another CPU unless appropriate instructions (memory barriers) are used. This is in addition to the problems caused by interleaved reads/writes, which can affect arithmetic operations potentially even on a single CPU system.
Comment 25 Joe Kislo 2010-10-25 13:00:50 UTC
(In reply to comment #21)
> Well, we are already misusing the phrase optlock vs real locking, so perhaps an
> additional pedantic mode that always locks?
> 
> 32 bit math is not smp-safe on most architectures.  There was an apr_atomic
> interface that illustrated that, if you would like to refer to the apr project
> sources.

We are experiencing this problem with mod_jk 1.2.28 and Tomcat 6.0.26 on Ubuntu Lucid 64-bit.  I can confirm this is still very much an issue.

When we do a graceful restart of Apache, the counter seems to reset back to 0; however, a graceful restart sets the counter back to 0 even if there *ARE* active connections (this is probably expected).  When those active connections finish up, the busy counter does not go negative, it stays at zero (which is correct).

Are there plans to fix this issue?  We have built into our Tomcat init.d wrapper the ability to gracefully bounce a JVM without interrupting or losing any pending Apache requests, or requests that come in while the JVM is restarting.  Unfortunately this bug makes that impossible, because we think there are active pending requests when there are not.
Comment 26 feizheng 2012-05-21 06:57:45 UTC
Created attachment 28807 [details]
Mod_jk scheduling errors
Comment 27 Paul Martin 2012-05-21 15:02:23 UTC
For anyone having this issue, you should know that mod_proxy can essentially be used as a functional replacement for mod_jk.

While mod_proxy is theoretically less efficient than mod_jk, we found that the overhead it added was minimal and of essentially no consequence, as the Apache-Tomcat bridge was far from being the bottleneck.

I worked around this issue by moving to mod_proxy; if this avenue is available to you, it is worth exploring.
Comment 28 Rainer Jung 2015-01-03 09:54:25 UTC
I added a warning to the docs in r1649182.

The only real solution would be to introduce atomics to "sync" increments and decrements of the busy counter. We could use APR atomics for mod_jk and would have to look for Windows atomics for IIS. That would leave NSAPI open. Since NSAPI seems to be used only very infrequently, we could leave the problem as-is there.
Comment 29 Rainer Jung 2015-01-04 09:59:18 UTC
I have now changed the implementation of the busy counter to use atomics (see r1649306).

This is based on the builtin atomics on Windows, gcc atomics if gcc is used, or APR atomics if APR is available. In all other cases, no atomics are used.

Some smaller problems might still exist; e.g. during a web server restart the busy number might not be correct. I wasn't able to reproduce such behavior, but it might happen. Most of the cases of skewed numbers should be fixed by the atomics, so I'm closing this ticket now.
Comment 30 Christopher Schultz 2015-01-06 19:40:07 UTC
Should we add a WARN-level log message if the user requests "busyness" and the binary does not support atomic busyness-tracking?

I think that's a reasonable warning to issue. We could just emit that log once at startup; no need to spew logs repeatedly.
Comment 31 Rainer Jung 2015-01-07 11:56:04 UTC
Warning added in r1650043:

[Wed Jan 07 12:53:10.882 2015] [6976:1] [warn] init::jk_lb_worker.c (1908): Missing support for atomics: LB method 'busyness' not recommended