Bug 66273 - Traffic is halted in apache when one of the worker network connection is down
Summary: Traffic is halted in apache when one of the worker network connection is down
Status: NEW
Alias: None
Product: Apache httpd-2
Classification: Unclassified
Component: mod_proxy_balancer
Version: 2.4.34
Hardware: PC All
Importance: P2 normal
Target Milestone: ---
Assignee: Apache HTTPD Bugs Mailing List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-09-22 11:24 UTC by MDReddy
Modified: 2022-09-25 14:48 UTC (History)
Description MDReddy 2022-09-22 11:24:09 UTC
In the load balancer config file, the two workers below are configured. While a high load was running, the network interface of one worker (10.192.1.2) was taken down to check robustness. Apache passed no traffic for almost 60 seconds or more, and there were some timeouts and 503 errors on the client side.

No other timeouts are set in the Apache configuration.

BalancerMember https://10.192.1.2:9443 ttl=60
BalancerMember https://10.192.1.3:9443 ttl=60

The expectation is that Apache should not halt traffic for 60 seconds; it should divert traffic to the working worker (10.192.1.3).

Could you please help me understand Apache's behavior when some of the workers are down, and whether any configuration can be used to tune this behavior?

Let me know if any more information is needed.
Comment 1 MDReddy 2022-09-22 11:48:06 UTC
In the load balancer config file, the two workers below are configured. While a high load was running, the network interface of one worker (10.192.1.2) was taken down to check robustness. Apache passed no traffic for almost 60 seconds or more, and there were some timeouts and 503 errors on the client side.

No timeout-related settings are configured in the Apache configuration.

BalancerMember https://10.192.1.2:9443 ttl=60
BalancerMember https://10.192.1.3:9443 ttl=60

The expectation is that Apache should not halt traffic for 60 seconds; it should divert traffic to the working worker (10.192.1.3).

Could you please help me understand Apache's behaviour when some of the workers are down, and whether any configuration can be used to tune this behaviour?

Let me know if any more information is needed.
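
For reference, a minimal sketch of how the two workers could sit inside a balancer definition with explicit failure-related parameters. The balancer name "backend", the path "/app", and the values chosen for connectiontimeout and retry are assumptions for illustration and do not come from the report. In mod_proxy, ttl is the time to live for inactive pooled connections, connectiontimeout is the time to wait for a connection to the backend to be established, and retry is how long a worker in the error state is skipped before it is tried again (60 seconds by default).

```apache
# Illustrative only: "backend" and "/app" are hypothetical names.
# SSLProxyEngine is needed because the members use https.
SSLProxyEngine on

<Proxy "balancer://backend">
    # ttl=60 is taken from the report; it only controls idle connection reuse.
    # connectiontimeout=5 and retry=30 are assumed values, not recommendations.
    BalancerMember "https://10.192.1.2:9443" ttl=60 connectiontimeout=5 retry=30
    BalancerMember "https://10.192.1.3:9443" ttl=60 connectiontimeout=5 retry=30
</Proxy>

ProxyPass        "/app" "balancer://backend"
ProxyPassReverse "/app" "balancer://backend"
```

Whether such settings remove the stall seen under load is not established by this report; the sketch only shows where the parameters go.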
Comment 2 MDReddy 2022-09-25 14:48:40 UTC
Update: I performed the same test with connectiontimeout=5, and the results were slightly different from those described above. No halt was seen, but the number of requests sent to the other node was very limited. Throughput was very low for 2 or 3 minutes, after which the healthy worker picked up the load. Apache should send traffic to the other worker when one worker is down, and there should not be any delay in traffic.

proxy_hcheck_module (hcmethod=TCP hcinterval=2 hcpasses=1 hcfails=1 connectiontimeout=5) was also used in the configuration, and it made no difference to the behaviour above. Very slow traffic was observed for a significant time (1 to 3 minutes) before the load on the healthy worker increased.
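
Written out as configuration, the health-check setup described here might look roughly like the sketch below. The parameter values are the ones quoted in this comment; the balancer name "backend", the LoadModule file paths, and applying the same parameters to both members are assumptions, since the full configuration is not shown in the report. mod_proxy_hcheck depends on mod_watchdog, which may already be compiled in on some builds.

```apache
# Module file paths are assumed and vary by distribution.
LoadModule watchdog_module     modules/mod_watchdog.so
LoadModule proxy_hcheck_module modules/mod_proxy_hcheck.so

# "backend" is a hypothetical balancer name.
<Proxy "balancer://backend">
    # hcmethod=TCP checks that a socket to the backend can be opened.
    # hcinterval=2: probe every 2 seconds.
    # hcfails=1 / hcpasses=1: one failed or passed probe flips the worker state.
    BalancerMember "https://10.192.1.2:9443" ttl=60 connectiontimeout=5 hcmethod=TCP hcinterval=2 hcpasses=1 hcfails=1
    BalancerMember "https://10.192.1.3:9443" ttl=60 connectiontimeout=5 hcmethod=TCP hcinterval=2 hcpasses=1 hcfails=1
</Proxy>
```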