Bug 44571 - Limits busy per worker to a threshold
Summary: Limits busy per worker to a threshold
Alias: None
Product: Tomcat Connectors
Classification: Unclassified
Component: Common
Version: unspecified
Hardware: All All
Importance: P2 enhancement
Target Milestone: ---
Assignee: Tomcat Developers Mailing List
Depends on:
Reported: 2008-03-10 06:26 UTC by Zealot
Modified: 2015-01-07 16:22 UTC

busy limit patch (7.74 KB, patch)
2008-03-10 06:28 UTC, Zealot

Description Zealot 2008-03-10 06:26:55 UTC
On a heavily loaded Tomcat server, a sudden burst of requests takes a long time to serve, but Apache keeps sending more requests to Tomcat, so Tomcat becomes slower and slower. Because the server does not respond, some web clients send their requests again, which slows the Tomcat server down even further. The effect is similar to network congestion.

I wrote a patch that limits the busy count per worker to a threshold. The threshold is defined in workers.properties. If all workers have reached the threshold, further requests get a 503 response. The sticky session parameter is ignored if the worker has reached the threshold.

It is very simple to configure: just add busylimit to the worker. The default is 0, which means no limit. For example:

worker.test-1.port = 8009
worker.test-1.host = localhost
worker.test-1.type = ajp13
worker.test-1.busylimit = 10
Comment 1 Zealot 2008-03-10 06:28:02 UTC
Created attachment 21648
busy limit patch
Comment 2 Tim Whittington 2010-01-07 14:25:45 UTC
I don't think this approach is the best way anymore.

You can use connection_pool_size to limit the busyness of any given worker, and (since 1.2.27) connection_acquire_timeout to specify a timeout for threads to obtain a connection when the pool is full.
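
As a rough sketch of that suggestion, a workers.properties fragment might look like the one below. The worker name, port and both values are hypothetical and would need tuning for a real setup. The timeout is assumed here to be given in milliseconds, which should be checked against the workers.properties reference for the installed mod_jk version.

# Hypothetical AJP worker; the name, host and port are placeholders.
worker.test-1.type = ajp13
worker.test-1.host = localhost
worker.test-1.port = 8009
# Allow at most 10 pooled backend connections (illustrative value).
worker.test-1.connection_pool_size = 10
# Give up quickly when no pooled connection is free
# (illustrative value, assumed to be milliseconds).
worker.test-1.connection_acquire_timeout = 100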

The only thing that might be worth changing is the response code when an endpoint cannot be obtained for a mapped worker. Currently this produces a 500, whereas an HTTP_STATUS_SERVICE_UNAVAIL (503) would be more appropriate.
Comment 3 Rainer Jung 2015-01-05 12:34:19 UTC
I agree with the comments from Tim. The "busy" limit can and should be done with the existing feature of restricting the connection_pool_size in combination with a (short) connection_acquire_timeout.

I have committed a change in r1649515 to ensure that we always return 503 if we can't get an endpoint. This will be part of version 1.2.41.

Closing this as FIXED because of the 503 part.

If you see any deficiency in the above suggestion, you can reopen this issue.
Comment 4 Rainer Jung 2015-01-07 16:22:11 UTC
I have changed my opinion. This can be configured using the connection_pool_size only for the ISAPI redirector. For the common mod_jk case, "busy" is a global counter, whereas the connection_pool_size is per process.

I have implemented busy_limit in r1650098, in a way very similar to your original patch.

Some remarks:

- The busy counter is known to have been buggy in the past. I have also applied a change to use atomics where available. Since it might still be buggy, I have flagged the busy_limit feature as experimental in the docs.

- I have made it configurable on the AJP or LB member worker, not on the LB worker itself. This way you can set it to different values for individual LB members. If you want to set it to the same value for all members, simply add it to a template using the "reference" feature.

- I decided to reuse the already existing "busy" state of a worker. As a consequence, there is a small change in behavior even if you don't use "busy_limit": if a request is sticky and there is no endpoint available, previously we nevertheless tried to process the request using the sticky worker. After this change, we will fail over the request (without marking the worker as in ERROR). Note that in the default mod_jk (Apache) case, this will never occur. It can only happen if you have fewer worker endpoints than threads, e.g. in the ISAPI or NSAPI case.
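
To illustrate the second remark, a workers.properties fragment along the following lines should work. Treat it as a sketch only: the worker names, ports and limit values are hypothetical, and it assumes that an attribute set directly on a member takes precedence over the value inherited through "reference".

# Hypothetical setup: one LB worker with two AJP members.
worker.list = lb

# Template holding the settings shared by all members.
worker.template.type = ajp13
worker.template.busy_limit = 10

# Member that takes busy_limit from the template.
worker.test-1.reference = worker.template
worker.test-1.host = localhost
worker.test-1.port = 8009

# Member that sets its own limit (assumed to override the template).
worker.test-2.reference = worker.template
worker.test-2.host = localhost
worker.test-2.port = 8010
worker.test-2.busy_limit = 20

# busy_limit is set on the members, not on the LB worker itself.
worker.lb.type = lb
worker.lb.balance_workers = test-1,test-2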

Thanks for your contribution.