We have 1 web server dispatching requests to 4 application servers. 2 of these servers are older, slower, and under more load from other sites they host. Under heavy load, we've observed that the busy queues of the workers are initially longer on the older hosts, which seems logical. However, gradually increasing the load can reach a tipping point where the older hosts can't get through the load quickly enough, so their queues grow and grow until Tomcat stalls. This happens while the two newer hosts are holding up fine and could take more load. An election algorithm that takes the busy queue into account would solve the problem. One such algorithm has been submitted in #36138.
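To illustrate the idea, here is a minimal sketch of a busy-queue-aware election, written in Python for brevity. It is not the patch from #36138; the `Worker` class and the `elect_least_busy`, `dispatch`, and `complete` functions are hypothetical names, and the sketch assumes the balancer can track how many requests each worker currently has in flight.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    name: str
    busy: int = 0  # requests currently in flight on this worker (assumed to be tracked)


def elect_least_busy(workers):
    """Return the worker with the fewest in-flight requests.

    Ties go to the worker that appears first in the list.
    """
    return min(workers, key=lambda w: w.busy)


def dispatch(workers):
    """Elect a worker for a new request and record it as in flight."""
    chosen = elect_least_busy(workers)
    chosen.busy += 1
    return chosen


def complete(worker):
    """Record that a request on this worker has finished."""
    worker.busy -= 1


if __name__ == "__main__":
    # Two older hosts with long queues, two newer hosts with short ones.
    pool = [Worker("old1", 5), Worker("old2", 4), Worker("new1", 1), Worker("new2", 2)]
    for _ in range(3):
        print(dispatch(pool).name)
```

With an election like this, new requests keep going to the newer hosts until their queues catch up with those of the older hosts, so a slow host would not accumulate an ever-growing backlog while faster hosts sit underused.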
Committed #36138. It will be part of the next release, 1.2.16.