Bug 55152 - graceful restart adjusts to new balancer members, but load balancing does not; fails over to first in list
Status: RESOLVED LATER
Alias: None
Product: Apache httpd-2
Classification: Unclassified
Component: mod_proxy_balancer
Version: 2.2.20
Hardware: Macintosh All
Importance: P2 normal
Target Milestone: ---
Assignee: Apache HTTPD Bugs Mailing List
URL:
Keywords: MassUpdate
Depends on:
Blocks:
 
Reported: 2013-06-28 01:37 UTC by john gale
Modified: 2018-11-07 21:08 UTC

Attachments
Added change can fix this issue (1.26 KB, text/plain)
2014-04-15 06:22 UTC, chillmein11
Description john gale 2013-06-28 01:37:36 UTC
Using a load balancer to split traffic roughly evenly between a few servers, I have a conf block like this:

      ProxyPass /munin balancer://balancer-group
      ProxyPassReverse /munin balancer://balancer-group
      ProxyTimeout 3600
      <Proxy "balancer://balancer-group">
         BalancerMember http://10.0.1.1:1234/munin route=munin1
         BalancerMember http://10.0.1.2:1234/munin route=munin2
         BalancerMember http://10.0.1.3:1234/munin route=munin3
#        BalancerMember http://10.0.1.4:1234/munin route=munin4
      </Proxy>

To find out which worker route each request takes, I adjusted the log format to this:

   CustomLog "/var/log/apache2/access_log" "%h direct port %p fwd %{BALANCER_WORKER_ROUTE}e %l %u %t \"%r\" %>s %b %D micsec"
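
With the route in every log line, the per-member distribution can be checked by tallying that field. A minimal Python sketch, assuming the exact CustomLog format above; the function name `count_routes` and the sample lines are made up for illustration and are not taken from a real log:

```python
import re
from collections import Counter

# Matches the start of a line written by the CustomLog format above:
# "%h direct port %p fwd %{BALANCER_WORKER_ROUTE}e ..."
ROUTE_RE = re.compile(r"^\S+ direct port \d+ fwd (\S+) ")


def count_routes(lines):
    """Return a Counter mapping each logged worker route to its request count."""
    counts = Counter()
    for line in lines:
        match = ROUTE_RE.match(line)
        if match:  # lines in any other format are skipped
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample lines, invented for illustration only.
SAMPLE = [
    '10.0.0.5 direct port 80 fwd munin1 - - [28/Jun/2013:01:30:00 +0000] "GET /munin/ HTTP/1.1" 200 512 1500 micsec',
    '10.0.0.6 direct port 80 fwd munin2 - - [28/Jun/2013:01:30:01 +0000] "GET /munin/ HTTP/1.1" 200 512 1400 micsec',
    '10.0.0.5 direct port 80 fwd munin1 - - [28/Jun/2013:01:30:02 +0000] "GET /munin/ HTTP/1.1" 200 512 1600 micsec',
    '10.0.0.7 direct port 80 fwd munin3 - - [28/Jun/2013:01:30:03 +0000] "GET /munin/ HTTP/1.1" 200 512 1450 micsec',
    '10.0.0.5 direct port 80 fwd munin1 - - [28/Jun/2013:01:30:04 +0000] "GET /munin/ HTTP/1.1" 200 512 1550 micsec',
]

if __name__ == "__main__":
    for route, n in count_routes(SAMPLE).most_common():
        print(route, n)
```

To run it against a real log, pass an open file object instead of `SAMPLE`, for example `count_routes(open("/var/log/apache2/access_log"))`. A route that is commented out in the conf but still appears in the tally would point at the stale state described below.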

While adding, removing, and maintaining the members, I comment some of them out and uncomment them again as I take them offline and bring them back online.  After adjusting this conf block I gracefully restart the server with a kill -USR1.

I have noticed that although a graceful restart does stop the flow of traffic to a commented-out server, I sometimes ended up in a state where Apache directed traffic only to the first server in the list, ignoring all the other supposedly live members.  When this occurred, the access log showed it attempting to send traffic to "munin4", which was commented out. It likely then defaulted to the first member in the list.

After a full (not graceful) restart of httpd, it succeeded in balancing traffic between all live members.

It could be a bug that a graceful restart does not successfully restart the load balancing algorithm, even though it reloads the member configuration settings. Or if this is intentional, it does not seem to be called out in the documentation, and causes angst among those trying to debug their setups.
Comment 1 john gale 2013-06-28 01:40:22 UTC
Possibly related to another "graceful doesn't fully reset" bug, https://issues.apache.org/bugzilla/show_bug.cgi?id=49771
Comment 2 Michael Bittorf 2014-03-26 13:32:56 UTC
Sounds like it's related to https://issues.apache.org/bugzilla/show_bug.cgi?id=44736
Comment 3 Eric Garreau 2014-03-28 09:09:04 UTC
LB problems are so use-case dependent that this one might also look like https://issues.apache.org/bugzilla/show_bug.cgi?id=56261
Comment 4 chillmein11 2014-04-15 06:22:31 UTC
Created attachment 31525
Added change can fix this issue

After a graceful restart, lbstatus and lbfactor are not initialized for newly added balancer members (lbfactor is 0).
The attached change initializes lbfactor to 1 if it is 0.

I have tested this change, and load balancing to newly added servers works for me, but I am not sure about side effects.
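
To illustrate why an lbfactor of 0 would matter, here is a small Python model of the byrequests scheduling step as it is described in the mod_proxy_balancer documentation (add lbfactor to each lbstatus, pick the highest lbstatus, subtract the total factor from the winner). It is a simplified sketch, not the httpd source and not the attached patch; the names `pick_worker` and `simulate` are made up, and the first-in-list tie-break is an assumption of this model:

```python
from collections import Counter


def pick_worker(workers):
    """Run one scheduling step of the simplified byrequests model.

    Each worker is a dict with 'route', 'lbfactor' and 'lbstatus' keys.
    Returns the route of the selected worker.
    """
    total_factor = 0
    candidate = None
    for worker in workers:
        worker["lbstatus"] += worker["lbfactor"]
        total_factor += worker["lbfactor"]
        # Assumption: strict '>' so that ties go to the earliest member.
        if candidate is None or worker["lbstatus"] > candidate["lbstatus"]:
            candidate = worker
    candidate["lbstatus"] -= total_factor
    return candidate["route"]


def simulate(lbfactors, requests):
    """Count how many of `requests` each member receives for the given lbfactors."""
    workers = [
        {"route": f"munin{i + 1}", "lbfactor": factor, "lbstatus": 0}
        for i, factor in enumerate(lbfactors)
    ]
    return Counter(pick_worker(workers) for _ in range(requests))


if __name__ == "__main__":
    print("all factors 1:", dict(simulate([1, 1, 1], 9)))
    print("all factors 0:", dict(simulate([0, 0, 0], 9)))
    print("new member 0: ", dict(simulate([1, 1, 0], 8)))
```

In this model, equal factors of 1 spread nine requests three per member, all-zero factors send every request to the first member, and a single member left at 0 never receives a request. That is consistent with the symptoms in this report, but whether httpd reaches exactly this state after a graceful restart is not confirmed here.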
Comment 5 William A. Rowe Jr. 2018-11-07 21:08:11 UTC
Please help us to refine our list of open and current defects; this is a mass update of old and inactive Bugzilla reports which reflect user error, already resolved defects, and still-existing defects in httpd.

As repeatedly announced, the Apache HTTP Server Project has discontinued all development and patch review of the 2.2.x series of releases. The final release 2.2.34 was published in July 2017, and no further evaluation of bug reports or security risks will be considered or published for 2.2.x releases. All reports older than 2.4.x have been updated to status RESOLVED/LATER; no further action is expected unless the report still applies to a current version of httpd.

If your report represented a question or confusion about how to use an httpd feature, an unexpected server behavior, problems building or installing httpd, or working with an external component (a third party module, browser etc.) we ask you to start by bringing your question to the User Support and Discussion mailing list, see [https://httpd.apache.org/lists.html#http-users] for details. Include a link to this Bugzilla report for completeness with your question.

If your report was clearly a defect in httpd or a feature request, we ask that you retest using a modern httpd release (2.4.33 or later) released in the past year. If it can be reproduced, please reopen this bug and change the Version field above to the httpd version you have reconfirmed with.

Your help in identifying defects or enhancements still applicable to the current httpd server software release is greatly appreciated.