Bug 44736 - mod_proxy_balancer loses its mind on reloads.
Status: RESOLVED FIXED
Product: Apache httpd-2
Classification: Unclassified
Component: mod_proxy_balancer
Version: 2.2.8
Hardware: PC Linux
Importance: P2 critical with 50 votes
Target Milestone: ---
Assigned To: Apache HTTPD Bugs Mailing List
Keywords: PatchAvailable
Duplicates: 42621 45950
Reported: 2008-04-01 17:22 UTC by Tautvydas
Modified: 2015-07-16 07:39 UTC



Attachments
keepalive.py - Python script to keep an apache process alive indefinitely by using the keepalive issue (544 bytes, text/x-python)
2011-01-26 20:03 UTC, Andrew
proposed patch (3.41 KB, patch)
2014-10-07 11:24 UTC, jkaluza
proposed patch v2 (3.42 KB, patch)
2014-10-07 11:27 UTC, jkaluza
proposed patch v3 (3.98 KB, patch)
2014-10-08 11:39 UTC, jkaluza
proposed patch v4 (3.54 KB, patch)
2014-10-08 12:25 UTC, jkaluza
proposed patch v5 (3.65 KB, patch)
2014-10-09 06:35 UTC, jkaluza
proposed patch v6 (5.93 KB, patch)
2014-10-09 11:56 UTC, Yann Ylavic
proposed patch v7 (8.87 KB, patch)
2014-10-28 23:35 UTC, Yann Ylavic

Description Tautvydas 2008-04-01 17:22:50 UTC
I have in my config:

<Virtualhost 10.10.11.108:443>
        ServerName test
        DocumentRoot "/var/www/test/"
        ErrorDocument 500 "/error/50x.php"
        ErrorDocument 502 "/error/50x.php"
        ErrorDocument 503 "/error/50x.php"
        ErrorDocument 403 "/error/403.html"
        ErrorDocument 404 "/error/404.html"
        ProxyPass /error/ !
        ProxyPass /lb/ !
        RewriteEngine On
        RewriteRule ^/$ /test/ [R]
<Location /lb>
        SetHandler balancer-manager
</Location>
        ProxyTimeout 3000
        KeepAlive Off
        ProxyPass / balancer://test/ stickysession=JSESSIONID nofailover=On
        ProxyPassReverse / http://10.10.20.51:8080/
        ProxyPassReverse / http://10.10.21.51:8080/
        <Proxy balancer://dashboard/>
                BalancerMember http://10.10.20.51:8080 route=node1 retry=0
                BalancerMember http://10.10.21.51:8080 route=node2 retry=0
        </Proxy>
        ProxyPreserveHost On
        ProxyErrorOverride On
        SSLEngine on
        SSLCertificateFile /etc/httpd/certs/_.localhost.com.crt
        SSLCertificateKeyFile /etc/httpd/certs/_.localhost.com.key
        SSLCertificateChainFile /etc/httpd/certs/some_intermediate.crt
#       LogLevel debug
        ErrorLog /var/log/httpd/test.error.log
        CustomLog /var/log/httpd/test.access.log combined
</Virtualhost>


When apache 2.2.8 starts everything works fine and if I go to https://10.10.11.108/lb/

I see something like this:

StickySession	Timeout	FailoverAttempts	Method
JSESSIONID	0	1 	byrequests

Worker URL	Route	RouteRedir	Factor	Set	Status	Elected	To	From
http://10.10.20.51:8080	node1		1	0	Ok	19013	19M	143M
http://10.10.21.51:8080	node2		1	0	Ok	1602	752K	5.1M

However, sometimes if I modify other virtual hosts and do:

# apachectl graceful

The load balancer loses stickiness, and if I go to https://10.10.11.108/lb/

I see something like this:

StickySession	Timeout	FailoverAttempts	Method
JSESSIONID	0	1 	byrequests

Worker URL	Route	RouteRedir	Factor	Set	Status	Elected	To	From
http://10.10.20.51:8080			0	0	Ok	19013	19M	143M
http://10.10.21.51:8080	node1		1	0	Ok	1602	752K	5.1M

If you do a full restart instead (/etc/init.d/httpd restart), it works fine.
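
A drift check along these lines could flag the mismatch after a graceful reload. This is only a minimal sketch: it assumes the balancer-manager table has been saved as tab-separated text like the output above, the names `expected_routes` and `find_route_drift` are hypothetical, and fetching the page is left out.

```python
import re

# Matches lines such as:
#   BalancerMember http://10.10.20.51:8080 route=node1 retry=0
MEMBER_RE = re.compile(r"^\s*BalancerMember\s+(\S+)(.*)$", re.IGNORECASE)
ROUTE_RE = re.compile(r"\broute=(\S+)")


def expected_routes(config_text):
    """Map each BalancerMember URL to the route named in the config."""
    routes = {}
    for line in config_text.splitlines():
        member = MEMBER_RE.match(line)
        if member:
            route = ROUTE_RE.search(member.group(2))
            routes[member.group(1)] = route.group(1) if route else ""
    return routes


def find_route_drift(config_text, manager_rows):
    """Return (url, expected, actual) for every worker whose route differs.

    manager_rows holds tab-separated worker rows copied from the
    balancer-manager page: Worker URL first, Route second (assumed layout).
    Rows whose first column is not a configured member URL are skipped.
    """
    expected = expected_routes(config_text)
    drift = []
    for row in manager_rows.splitlines():
        cols = [col.strip() for col in row.split("\t")]
        if len(cols) < 2 or cols[0] not in expected:
            continue
        if cols[1] != expected[cols[0]]:
            drift.append((cols[0], expected[cols[0]], cols[1]))
    return drift
```

An empty result means every listed worker still carries its configured route; any tuple in the result names a worker that needs attention.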
Comment 1 Jim Jagielski 2008-08-18 09:12:47 UTC
Please try with 2.2.9
Comment 2 Tautvydas 2009-03-09 12:10:27 UTC
Tried with 2.2.9 still not fixed. I believe my report is a duplicate of this:
https://issues.apache.org/bugzilla/show_bug.cgi?id=42621

For now my "simple solution" is a Perl wrapper around the Apache reload. Basically: run apachectl graceful, read the configs from a Perl script, and go to the balancer-manager web page to fix the routes.
Comment 3 William A. Rowe Jr. 2009-05-19 10:49:49 UTC
*** Bug 42621 has been marked as a duplicate of this bug. ***
Comment 4 Andrew 2011-01-26 20:03:50 UTC
Created attachment 26555 [details]
keepalive.py - Python script to keep an apache process alive indefinitely by using the keepalive issue
Comment 5 Glenn Nielsen 2011-03-28 10:49:56 UTC
I have seen the exact same problem with mod_proxy_balancer losing its routes when you do an apachectl graceful. Here is my relevant config:

ProxyPass /balancer-manager !

<Proxy balancer://webmail>
  BalancerMember http://boreas.sp:80 route=boreas loadfactor=1
  BalancerMember http://chinook.sp:80 route=chinook loadfactor=1
  BalancerMember http://zephyrus.sp:80 route=zephyrus loadfactor=1
  ProxySet lbmethod=byrequests
</Proxy>
ProxyPass / balancer://webmail/ stickysession=WEBMAILID
Comment 6 Glenn Nielsen 2011-03-28 10:53:24 UTC
I have seen the exact same problem with mod_proxy_balancer losing its routes when you do an apachectl graceful. Here is my relevant config:

ProxyPass /balancer-manager !

<Proxy balancer://webmail>
  BalancerMember http://boreas.sp:80 route=boreas loadfactor=1
  BalancerMember http://chinook.sp:80 route=chinook loadfactor=1
  BalancerMember http://zephyrus.sp:80 route=zephyrus loadfactor=1
  ProxySet lbmethod=byrequests
</Proxy>
ProxyPass / balancer://webmail/ stickysession=WEBMAILID

Oh, here is the server info (Server runs as a VM in ESX):

FreeBSD kottke.kinetic.more.net 7.3-RELEASE-p2 FreeBSD 7.3-RELEASE-p2 #0: Mon Jul 12 19:23:19 UTC 2010     root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64 

Server version: Apache/2.2.16 (FreeBSD)
Server built:   Aug 16 2010 15:38:53
Comment 7 Jim Jagielski 2011-08-04 14:49:40 UTC
WONTFIX - 2.2.x does not guarantee local changes via balancer-manager are kept
Comment 8 Robert Egglestone 2011-08-04 20:57:04 UTC
The comment on the WONTFIX above suggests that the issue has been misinterpreted.

The issue is not that local changes via balancer-manager are being lost, this behaviour is understood.

The issue is that by making a change to the configuration file (no changes via balancer-manager), and then doing a graceful restart, the routes can become offset in a way that no longer matches the configured servers.

For example, if I have this:

<Proxy balancer://balancer1/>
  BalancerMember http://host1 route=host1
  BalancerMember http://host2 route=host2
  BalancerMember http://host3 route=host3
</Proxy>

... and I comment out host1, then do a graceful restart, I may end up with Apache behaving like this ...

http://host2 route=host1
http://host3 route=host2


Some observed notes:

+ Problems appear when new balancer blocks are added or removed, or balancer 
members are added or removed.

+ The route names can even cross from one balancer configuration block to another!

+ It doesn't always happen, we see it on active systems.

+ A subsequent graceful restart usually fixes the routes.

+ When it does happen, it happens consistently - we have multiple httpd servers polling Subversion for configuration changes, then checking them out and doing a graceful restart. In this situation, all of our httpd servers tend to develop the same route problems at the same time.


This issue has various impacts on applications. In the worst case we saw it cause all traffic for a balancer to be directed to a single balancer member, which couldn't handle the load.
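
The offset in the host1/host2/host3 example is consistent with runtime state being looked up by position rather than by name. The toy model below is only an illustration, not httpd's code: it assumes the state that survives a graceful restart is a list of slots indexed by worker position, and that an already-initialized slot is reused as-is. `graceful_restart_by_index` is a hypothetical name.

```python
def graceful_restart_by_index(shared_slots, new_config):
    """Toy model: attach each configured worker to the slot at its index.

    shared_slots: list of dicts that survive the restart (one per old worker).
    new_config:   list of (url, route) pairs from the edited config file.
    A slot that is already initialized keeps its old route; that assumption
    is what produces the mismatch.
    """
    effective = []
    for index, (url, route) in enumerate(new_config):
        if index < len(shared_slots) and shared_slots[index]["initialized"]:
            # Stale slot reused: the old route wins over the configured one.
            effective.append((url, shared_slots[index]["route"]))
        else:
            effective.append((url, route))
    return effective


# Slots left behind by the original three-member balancer.
old_slots = [
    {"initialized": True, "route": "host1"},
    {"initialized": True, "route": "host2"},
    {"initialized": True, "route": "host3"},
]

# host1 is commented out, then "apachectl graceful" is issued.
after = graceful_restart_by_index(
    old_slots,
    [("http://host2", "host2"), ("http://host3", "host3")],
)
print(after)
```

Under these assumptions the model pairs http://host2 with route host1 and http://host3 with route host2, the same shift reported above.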
Comment 9 bugzilla 2012-04-02 20:15:09 UTC
Same here on a busy system. Changing from a configuration with one active backend plus a hot spare to a sticky load-balanced setup, activated by a reload, causes node1 and node2 sessions to be routed to node1 at first. Only after a restart are they routed correctly to separate nodes, which means half of the users lose their session info. The observed workaround is to always do a restart.
Comment 10 dario_uni 2012-04-13 22:53:22 UTC
Reproduced on OEL 5 with apache httpd 2.2.17.
I'll try to reproduce on 2.2.22.
Comment 11 tarun 2012-05-04 23:21:18 UTC
*** Bug 45950 has been marked as a duplicate of this bug. ***
Comment 12 krauseed 2012-10-18 01:00:10 UTC
I have been able to reproduce this on 2.2.22
Comment 13 Jim Jagielski 2012-11-01 14:34:12 UTC
Can you check w/ 2.4.x and/or trunk?
Comment 14 Frank 2014-03-13 13:47:20 UTC
Problem still exists with 2.2.24.

I have no idea about 2.4, though. That would be a major change in our environment, which we don't intend to make.
Comment 15 William Lovaton 2014-08-02 00:59:55 UTC
Hi there,

Can this bug be fixed in the 2.2 branch?? I'm using RHEL 6 and it would be a huge and risky move to migrate to apache 2.4.

I have more than 100 virtual hosts in my Apache Reverse Proxy, some of them are balanced with mod_proxy_balancer to backend servers (some are JBoss AJP and some others are Apache/PHP) but the vast majority are simple ProxyPass directives.

The annoying thing with this bug is that I sometimes need to make small adjustments to the config (e.g. adding a new ProxyPass directive to an existing vhost), most of the time in vhosts that do not use mod_proxy_balancer. When I do a graceful for the changes to take effect, the vhosts using mod_proxy_balancer sometimes forget the routes, and suddenly all of the clients are redirected to the first server. Half of the users lose their sessions, and one server is overloaded while the other is idle.

Right now I'm using Apache 2.2.27 and the problem still happens.
Comment 16 Frank 2014-08-04 08:37:11 UTC
Yes, no one seems to care. Almost like using Oracle software.
Is at least someone reading this?
Comment 17 Jeff Trawick 2014-08-04 23:33:13 UTC
>Yes, no one seems to care. Almost like using Oracle software.
>Is at least someone reading this?

Ha ha ha!

The 2.2 answer was given previously:  WONTFIX - 2.2.x does not guarantee local changes via balancer-manager are kept

If you want to be constructive:

Can someone that needs it on the older release volunteer to confirm that it is resolved on 2.4, as requested some time ago?
Comment 18 William Lovaton 2014-08-04 23:45:34 UTC
(In reply to Jeff Trawick from comment #17)
> >Yes, no one seems to care. Almost like using Oracle software.
> >Is at least someone reading this?
> 
> Ha ha ha!
> 
> The 2.2 answer was given previously:  WONTFIX - 2.2.x does not guarantee
> local changes via balancer-manager are kept
> 
> If you want to be constructive:
> 
> Can someone that needs it on the older release volunteer to confirm that it
> is resolved on 2.4, as requested some time ago?

Thanks Jeff for the answer but that WONTFIX does not apply here.  It is obvious (for me at least) that changes done through the balancer-manager page won't be kept between restarts.  That's totally understandable.

The thing here is that BalancerMember configuration gets mangled somehow when you issue a graceful.  It sometimes swaps the route name or the loadfactor changes to 0 when the configuration file says loadfactor=1.  The end result of this is that all requests are now sent to one server only.

My workaround here is to always do a restart instead of a graceful but this is not really a good idea when hundreds of vhosts are going through this reverse proxy (the active requests will be lost abruptly).

Unfortunately no one using 2.2 here is able to confirm that 2.4 fixes the problem.  That at least would be an incentive to take the leap.
Comment 19 Jeff Trawick 2014-08-05 00:01:12 UTC
>The thing here is that BalancerMember configuration gets mangled 
>somehow when you issue a graceful.  It sometimes swaps the 
>route name or the loadfactor changes to 0 when the configuration 
>file says loadfactor=1.  The end result of this is that all 
>requests are now sent to one server only.

Thanks for reiterating that.  (I recall that shared memory setup is different for graceful restart vs. hard restart, but I haven't looked at this particular bug.)
Comment 20 Michael Bittorf 2014-08-05 05:56:01 UTC
We've tested 2.4 in our test environment and couldn't reproduce the reported behavior. Unfortunately it's not possible to update our production environment to 2.4 to validate this with real day by day usage.
Comment 21 Frank 2014-08-05 09:44:15 UTC
(In reply to William Lovaton from comment #18)
> (In reply to Jeff Trawick from comment #17)
> > If you want to be constructive:
> > 
> > Can someone that needs it on the older release volunteer to confirm that it
> > is resolved on 2.4, as requested some time ago?
> 
> Thanks Jeff for the answer but that WONTFIX does not apply here.  It is
> obvious (for me at least) that changes done through the balancer-manager
> page won't be kept between restarts.  That's totally understandable.
> 
> The thing here is that BalancerMember configuration gets mangled somehow
> when you issue a graceful.  It sometimes swaps the route name or the
> loadfactor changes to 0 when the configuration file says loadfactor=1.  The
> end result of this is that all requests are now sent to one server only.
> 
> My workaround here is to always do a restart instead of a graceful but this
> is not really a good idea when hundreds of vhosts are going through this
> reverse proxy (the active requests will be lost abruptly).

That's exactly the problem. For a restart I would have to take one server out of the load balancing, wait until sessions have finished, restart it, move on to the next server, and so on.
I just can't do that after every small config change.

> 
> Unfortunately no one using 2.2 here is able to confirm that 2.4 fixes the
> problem.  That at least would be an incentive to take the leap.

I can't reproduce it in our dev environment, because we just don't have multiple backend servers there. I would need to move a production server to 2.4 and rewrite all the vhost configs. If this were an option, I would have moved everything to 2.4 already.

If this won't be fixed, we just stick with mod_jk. It's buggy too, and handling of the configuration files sucks, but that's the only alternative.
Comment 22 jkaluza 2014-10-07 11:24:09 UTC
Created attachment 32086 [details]
proposed patch

Match the shared memory with workers according to their names.
Comment 23 jkaluza 2014-10-07 11:27:35 UTC
Created attachment 32087 [details]
proposed patch v2

Match the shared memory with workers according to their names.
Comment 24 Yann Ylavic 2014-10-08 10:47:37 UTC
Jan,
thanks for the patch which seems to work.
Maybe we could use the apr_md5() of the worker's name to save space in SHM?
Comment 25 jkaluza 2014-10-08 11:19:16 UTC
That's a good idea. It would also fix the problem of worker names longer than 255 bytes. Expect v3 soon.
Comment 26 jkaluza 2014-10-08 11:39:16 UTC
Created attachment 32091 [details]
proposed patch v3

Match the shared memory with workers according to MD5 hash of their names.
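
A rough model of the idea in this patch, written in Python rather than the C of the patch itself. It assumes each surviving slot stores the MD5 digest of the worker name it was created for; a worker first looks for a slot carrying its own digest and otherwise claims a free one. `name_digest`, `find_slot`, and the slot dictionary layout are hypothetical.

```python
import hashlib


def name_digest(worker_name):
    """MD5 digest of the worker name, used as a fixed-size slot key."""
    return hashlib.md5(worker_name.encode("utf-8")).digest()


def find_slot(slots, worker_name):
    """Return the slot owned by worker_name, else claim the first free slot.

    Returns None when no slot matches and none is free.
    """
    digest = name_digest(worker_name)
    free_slot = None
    for slot in slots:
        if slot["digest"] == digest:
            return slot
        if slot["digest"] is None and free_slot is None:
            free_slot = slot
    if free_slot is not None:
        # Mark the free slot as belonging to this worker from now on.
        free_slot["digest"] = digest
    return free_slot


# Slots left over from a previous configuration, plus one unused slot.
slots = [
    {"digest": name_digest("http://host1"), "route": "host1"},
    {"digest": name_digest("http://host2"), "route": "host2"},
    {"digest": None, "route": ""},
]

# host2 finds its own slot even though it is now the first configured worker.
print(find_slot(slots, "http://host2")["route"])
```

Because the digest has a fixed length of 16 bytes, the key stays the same size however long the worker name is.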
Comment 27 Yann Ylavic 2014-10-08 12:17:26 UTC
Thanks, looks good.

Isn't the ap_get_scoreboard_lb() call in init_balancer_members() also concerned by this?

Detail: apr_md5() does all the init/update/final in a one go
Comment 28 jkaluza 2014-10-08 12:24:35 UTC
(In reply to Yann Ylavic from comment #27)
> Thanks, looks good.
> 
> Isn't the ap_get_scoreboard_lb() call in init_balancer_members() also
> concerned by this?

Not sure I understand. I change that ap_get_scoreboard_lb to ap_proxy_get_scoreboard_lb.

> Detail: apr_md5() does all the init/update/final in a one go

Done in next attached patch. Thanks.
Comment 29 jkaluza 2014-10-08 12:25:21 UTC
Created attachment 32093 [details]
proposed patch v4

Match the shared memory with workers according to MD5 hash of their names. Now just with apr_md5().
Comment 30 Yann Ylavic 2014-10-08 12:33:10 UTC
(In reply to jkaluza from comment #28)
> (In reply to Yann Ylavic from comment #27)
> > Isn't the ap_get_scoreboard_lb() call in init_balancer_members() also
> > concerned by this?
> 
> Not sure I understand. I change that ap_get_scoreboard_lb to
> ap_proxy_get_scoreboard_lb.

I'm not sure either, but I think this should be done since the purpose is to determine whether the worker has already been initialized based on the status in SHM this time, so that lb parameters are not reset spuriously (below).
Comment 31 jkaluza 2014-10-08 13:34:18 UTC
I already have it in my patches (or am I missing something)?

-            slot = (proxy_worker_stat *) ap_get_scoreboard_lb(workers->id);
+            slot = (proxy_worker_stat *) ap_proxy_get_scoreboard_lb(workers);
Comment 32 Yann Ylavic 2014-10-08 15:10:13 UTC
(In reply to jkaluza from comment #31)
> I already have it in my patches (or am I missing something)?
> 
> -            slot = (proxy_worker_stat *) ap_get_scoreboard_lb(workers->id);
> +            slot = (proxy_worker_stat *)
> ap_proxy_get_scoreboard_lb(workers);

Sorry, it was based on your previous comment, and I didn't update the page to see the latest patch.

Maybe you can add a fast path in ap_proxy_get_scoreboard_lb() with something like :

+void *ap_proxy_set_scoreboard_lb(proxy_worker *worker) {
+    int i = 0;
+    proxy_worker_stat *free_slot = NULL;
+    proxy_worker_stat *s;
+    unsigned char digest[APR_MD5_DIGESTSIZE];
+
+    if (!ap_scoreboard_image) {
+        return NULL;
+    }
     if (worker->s) {
         return worker->s;
     }
+
+    apr_md5(digest, (const unsigned char *) worker->name,
+            strlen(worker->name));
+
+    /* Try to find out the right shared memory according to the hash
+     * of worker->name. */
+    while ((s = (proxy_worker_stat *)ap_get_scoreboard_lb(i++)) != NULL) {
+        if (memcmp(s->digest, digest, APR_MD5_DIGESTSIZE) == 0) {
             worker->s = s;
+            return s;
+        }
+        else if (s->status == 0 && free_slot == NULL) {
+            free_slot = s;
+        }
+    }
+
+    /* We failed to find out shared memory, so just use free slot */
     worker->s = free_slot;
+    return free_slot;
+}

so that the double call from init_balancer_members() does not hurt.
(Note that I renamed it ap_proxy_set_scoreboard_lb, according to the changes...)

Apologies (again) to propose partial things each time, this is the last one I hope.
Comment 33 jkaluza 2014-10-09 06:35:27 UTC
Created attachment 32095 [details]
proposed patch v5
Comment 34 Yann Ylavic 2014-10-09 11:56:24 UTC
Created attachment 32098 [details]
proposed patch v6

This version avoids using ap_proxy_set_scoreboard_lb() in init_balancer_members() when PROXY_HAS_SCOREBOARD is not defined, and sets worker->s (with palloc()ed memory) only if it was not set above with ap_proxy_set_scoreboard_lb().
Comment 35 Yann Ylavic 2014-10-09 12:30:23 UTC
Proposed for 2.2.x in r1630402.
Comment 36 Ferry Manders 2014-10-28 13:30:40 UTC
We have tried the patch first in our test environment without any noticeable issues. Today we rolled out the patch with Apache 2.2.29 on our production systems and noticed that the first worker in the balancer was favoured.

We noticed that the first worker received about 90% of all the requests and the second worker received the remaining 10%, while the third and fourth workers received almost no requests.

The view from one of our balancer-managers:

Worker URL	Route	RouteRedir	Factor	Set	Status	Elected	To	From
http://xrtv1afp	xrtv1a			0	0	Ok	25944	20M	223M
http://xrtv1bfp	xrtv1b			0	0	Ok	2077	2.2M	24M
http://xrtv1cfp	xrtv1c			0	0	Ok	196	193K	1.9M
http://xrtv1dfp	xrtv1d			0	0	Ok	278	274K	1.6M

We noticed this behaviour on all our updated frontproxies, which range around 150~200 apache instances.

After rolling back the patch and staying on the 2.2.29 release the issue was gone.
Comment 37 Yann Ylavic 2014-10-28 14:28:40 UTC
According to the balancer-manager, the lbfactor seems to be 0. That shouldn't happen.

Are your balancer members also used as standalone workers (eg. same URL used before in a ProxyPass or <Proxy> section in your configuration)?
Comment 38 William Lovaton 2014-10-28 14:55:41 UTC
(In reply to Yann Ylavic from comment #37)
> According to the balancer-manager, the lbfactor seems to be 0. That
> shouldn't happen.
> 
> Are your balancer members also used as standalone workers (eg. same URL used
> before in a ProxyPass or <Proxy> section in your configuration)?

That's happening to me too after installing a test package for RHEL 6.  There is also another problem I just noticed after the update:

I have two <Proxy balancer> directives in my config for the same domain, one for port 80 and another one for port 443 (the secure connection is not mandatory yet) and before applying the patch both balancer-manager pages showed independent values and stats for plain and secure, now they are showing exactly the same values.  In my case the secure connection used to receive a lot less connections than the unsecure one.

The config for port 80 is this one:

      Header add Set-Cookie "ROUTEID=.%{BALANCER_WORKER_ROUTE}e; path=/" env=BALANCER_ROUTE_CHANGED
      <Proxy balancer://ciklos-balancer>
         BalancerMember http://cdplin25.coomeva.nal:80 route=web1 loadfactor=1 retry=0
         BalancerMember http://cdplin26.coomeva.nal:80 route=web2 loadfactor=1 retry=0

         ProxySet stickysession=ROUTEID
         ProxySet nofailover=On
         ProxySet lbmethod=bybusyness
      </Proxy>

      ProxyPass /balancer-manager !
      ProxyPass / balancer://ciklos-balancer/
      ProxyPassReverse / balancer://ciklos-balancer/



And config for port 443 is the following (the only difference is the balancer name):

      Header add Set-Cookie "ROUTEID=.%{BALANCER_WORKER_ROUTE}e; path=/" env=BALANCER_ROUTE_CHANGED
      <Proxy balancer://ssl-ciklos-balancer>
         BalancerMember http://cdplin25.coomeva.nal:80 route=web1 loadfactor=1 retry=0
         BalancerMember http://cdplin26.coomeva.nal:80 route=web2 loadfactor=1 retry=0

         ProxySet stickysession=ROUTEID
         ProxySet nofailover=On
         ProxySet lbmethod=bybusyness
      </Proxy>

      ProxyPass /balancer-manager !
      ProxyPass / balancer://ssl-ciklos-balancer/
      ProxyPassReverse / balancer://ssl-ciklos-balancer/


Also note that the loadfactor is 1 but the balancer-manager shows 0 in both cases even after a hard stop/start sequence.
Comment 39 Ferry Manders 2014-10-28 18:48:02 UTC
(In reply to Yann Ylavic from comment #37)
> According to the balancer-manager, the lbfactor seems to be 0. That
> shouldn't happen.
> 
> Are your balancer members also used as standalone workers (eg. same URL used
> before in a ProxyPass or <Proxy> section in your configuration)?

We use this type of configuration.
<Proxy balancer://xrtv1cluster>
        BalancerMember  http://xrtv1afp route=xrtv1a
        BalancerMember  http://xrtv1bfp route=xrtv1b
        BalancerMember  http://xrtv1cfp route=xrtv1c
        BalancerMember  http://xrtv1dfp route=xrtv1d
        ProxySet        stickysession=xrtv1balanceid
</Proxy>

and then we proxypass through the balancer.

ProxyPass       /       balancer://xrtv1cluster/

However we do request the server-status page every now and then directly from the worker.
The lbfactor should be the default (1), and I see that currently (without the patch) it shows this as well:

Worker URL	Route	RouteRedir	Factor	Set	Status	Elected	To	From
http://xrtv1afp	xrtv1a			1	0	Ok	9914	9.6M	91M
http://xrtv1bfp	xrtv1b			1	0	Ok	9926	9.4M	93M
http://xrtv1cfp	xrtv1c			1	0	Ok	10063	9.6M	102M
http://xrtv1dfp	xrtv1d			1	0	Ok	9943	10M	94M
Comment 40 Yann Ylavic 2014-10-28 23:35:08 UTC
Created attachment 32159 [details]
proposed patch v7

Thanks for testing and reporting the defect.

The previous patch missed the (needed) unicity per vhost and per balancer for the workers and balancer members.

This new patch should fix this.
Could you please give it a try?
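
Why the worker name alone is not a sufficient key can be seen with the two balancers from comment 38, which list identical member URLs. The sketch below is an assumption about the shape of the fix, not the patch's actual key layout: it contrasts hashing the worker URL only with hashing vhost, balancer, and worker together. `slot_key` and the vhost labels are hypothetical.

```python
import hashlib


def slot_key(worker_url, balancer=None, vhost=None):
    """Hash the worker URL, optionally scoped by balancer and vhost."""
    parts = [p for p in (vhost, balancer, worker_url) if p is not None]
    return hashlib.md5("|".join(parts).encode("utf-8")).hexdigest()


member = "http://cdplin25.coomeva.nal:80"

# Keyed by URL only: both balancers resolve to the same slot and share stats.
plain = slot_key(member)
secure = slot_key(member)
print(plain == secure)  # True

# Keyed by vhost + balancer + URL: each member gets its own slot.
plain = slot_key(member, "balancer://ciklos-balancer", "vhost:80")
secure = slot_key(member, "balancer://ssl-ciklos-balancer", "vhost:443")
print(plain == secure)  # False
```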
Comment 41 Yann Ylavic 2014-10-28 23:48:04 UTC
(In reply to Yann Ylavic from comment #40)
> The previous patch missed the (needed) unicity per vhost and per balancer
> for the workers and balancer members.
s/unicity/uniqueness/
Comment 42 William Lovaton 2014-10-28 23:53:35 UTC
Thanks Yann for your help.

Does it solve the problem about the load factor being 0 even when it's explicitly set to 1 in the config file?
Comment 43 Yann Ylavic 2014-10-29 01:23:23 UTC
(In reply to William Lovaton from comment #42)
> Does it solve the problem about the load factor being 0 even when it's
> explicitly set to 1 in the config file?

Yes it should, the balancer member was initialized as a normal worker without the specific balancer parameters (hence those were all 0).

I'd appreciate it if you could verify this with your configuration, though.
Comment 44 Ferry Manders 2014-10-29 08:47:47 UTC
We've implemented the new patch in our test environment and used ApacheBench to test the system.
As far as we can currently see the new patch works more as intended
The lbfactor is set to 1 and also the requests are balanced evenly over the 2 workers.


LoadBalancer Status for balancer://xrtv1cluster

StickySession		Timeout	FailoverAttempts	Method
balancer://xrtv1cluster	0	1			byrequests

Worker URL		Route	RouteRedir	Factor	Set	Status	Elected	To	From
http://xrtv1afp-test	xrtv1a			1	0	Ok	1776	473K	626K
http://xrtv1bfp-test	xrtv1b			1	0	Ok	1775	473K	626K
Comment 45 Yann Ylavic 2014-10-29 09:44:10 UTC
(In reply to Ferry Manders from comment #44)
> As far as we can currently see the new patch works more as intended
> The lbfactor is set to 1 and also the requests are balanced evenly over the
> 2 workers.

Thanks for testing, I'll propose this new patch instead of the previous one for 2.2.x backport.
Comment 46 Yann Ylavic 2014-10-29 10:09:04 UTC
Backport proposal (2.2.x) updated in r1635084.
Comment 47 Yann Ylavic 2015-07-16 07:39:13 UTC
Fixed in 2.2.30 (r1680920).