Bug 59319 - ProxyPass connectiontimeout not honored with specific target url
Summary: ProxyPass connectiontimeout not honored with specific target url
Status: RESOLVED DUPLICATE of bug 59373
Alias: None
Product: Apache httpd-2
Classification: Unclassified
Component: mod_proxy (show other bugs)
Version: 2.4-HEAD
Hardware: PC All
Importance: P2 normal (vote)
Target Milestone: ---
Assignee: Apache HTTPD Bugs Mailing List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-04-14 10:46 UTC by Ben RUBSON
Modified: 2016-04-25 09:18 UTC (History)
0 users



Attachments

Description Ben RUBSON 2016-04-14 10:46:42 UTC
Hello,

I think I have found a bug around the ProxyPass connectiontimeout parameter.

Here is the test config:

RewriteEngine On
RewriteMap chooseproxy prg:/chooseproxy.pl
RewriteCond ${chooseproxy:%{THE_REQUEST}} ^([0-9]{4,5})$
RewriteRule (.*) - [L,E=proxytouse:127.0.0.1:%1]
ProxyPassInterpolateEnv on
ProxyPass "/api/" "https://${proxytouse}/api/" interpolate connectiontimeout=5

Depending on the request, chooseproxy.pl returns the proxy port to connect to.
It works fine, however connectiontimeout is not honored: if the port is not reachable, Apache only times out after timeout seconds (the value of the global timeout parameter).

If I change my ProxyPass rule removing the ${proxytouse} variable, for example :
ProxyPass "/api/" "https://127.0.0.1:1234/api/" interpolate connectiontimeout=5
Then connectiontimeout parameter works correctly.

Thank you,

Best regards,

Ben
Comment 1 Rainer Jung 2016-04-14 17:15:21 UTC
The scheme/host/port part of the target URI is "https://${proxytouse}". Any setting added to this ProxyPass via key=value will be saved under this worker name. Later, at request runtime, "${proxytouse}" is first resolved to a real hostname plus port, say host1 and port1. Then a worker is looked up under this hostname and port, i.e. "https://host1:port1". Since no such worker was defined in the config, the default reverse proxy worker is used, which does not have these key=value settings applied (and you will probably not get connection pooling either).

If you need to work with a host and port provided by a script, and the list of possible host plus port pairs is not too long to maintain, you should configure workers for those backends, so you get connection pooling and the actual key=value settings you like.

Example: Suppose myhost:myport is one of the values returned by your script, then add

<Proxy "https://myhost:myport">
  ProxySet connectiontimeout=5 timeout=30
</Proxy>

to your config (and you can remove those key=value parts from your interpolated ProxyPass; I think they are useless there).

You can get a bit more insight about which workers are used etc. by setting LogLevel to trace8 or at least the LogLevel of mod_proxy to trace8 ("LogLevel info proxy:trace8").

If you don't like repeating those <Proxy> blocks for all your backends, you can define a macro using mod_macro:

<Macro MyWorker $hostAndPort>
  <Proxy "https://$hostAndPort">
    ProxySet connectiontimeout=5 timeout=30
  </Proxy>
</Macro>

and then use the macro:

Use MyWorker host1:port1
Use MyWorker host2:port2
...

That reduces the redundancy in repeating the communication settings for each Worker. The macro name "MyWorker" can be replaced by something more meaningful for you.

If the number of host/port pairs is really huge, then it is possible that you really don't want pooling, because reuse of existing connections before they time out might be rare. In that case using the default reverse proxy worker for all those backend connections should be OK and you wouldn't explicitly define workers. Unfortunately, IMHO we currently lack a way of defining worker params (key=value) as defaults for all workers and/or for the default (reverse) proxy worker. So in that case there would be no way of defining e.g. the connectiontimeout etc.

Whether it is a bug that key=value settings do not work when the worker's scheme, host name or port is interpolated can be discussed. I'd say that with the current implementation it is not expected to work, but we could maybe log a warning if you try to use that pattern.

Let's see what others comment.
Comment 2 Ben RUBSON 2016-04-15 07:00:06 UTC
Rainer,

First of all, thank you very much for your long, detailed, precise answer...
I really appreciate it !

So, understood: a worker named "https://${proxytouse}" will never be matched by a resolved target such as "https://host1:port1" in your example.
Let's assume it would; some parameters such as connection pooling in a worker like "https://${proxytouse}" would not really make sense, as in reality it would address many different addresses/ports.
To work, a new worker (a clone of the default worker with specific user-defined parameters applied) would have to be created internally each time the generic one is resolved to a new value.

Unfortunately as you said there is no possibility to enforce default values for the default worker.
Once again some values would certainly not make sense for the default worker (connection pooling...), but some of them, such as for example connectiontimeout, timeout... would be worth it.

In my use case, I have many host/port pairs which vary with the system's configuration and furthermore change over time.
So a unique configuration with variables is very practical.

I could think about the following: each time a new host/port pair arrives on the server, automatically create a small configuration file containing only the following macro line:
Use MyWorker hostX:portX
And reload the Apache configuration.
The macro would be defined in the main configuration.
The main configuration would also contain something like:
Include /path/to/workers/configurations/worker.*.conf
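A minimal sketch of that layout (the paths, host1:8443 and the ProxySet values are placeholders, reusing the macro from comment 1):

```apache
# Main configuration: define the macro once, then pull in
# one small generated file per backend.
<Macro MyWorker $hostAndPort>
  <Proxy "https://$hostAndPort">
    ProxySet connectiontimeout=5 timeout=30
  </Proxy>
</Macro>
Include /path/to/workers/configurations/worker.*.conf

# /path/to/workers/configurations/worker.host1.conf
# (generated automatically, one line per backend):
Use MyWorker host1:8443
```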

Of course being able to define default parameters for the default worker (or for its clones) would be even easier / cleaner.

That's another story, but declaring a worker for each host:port would activate connection pooling for each one of them.
I think connection pooling would be worth it in my use case: each time a user connects to the service, they are forwarded to the same reverse proxy. So connection pooling could be a good thing to have.
However, can connection pooling have an impact in terms of performance if Apache has to manage many workers?
Could we tend towards something like a denial of service if the maximum number of connections in the pools is reached, and no other connections to other workers can be made?
Or is there no limit at all here?

Being able to define default worker (or worker clone) parameters would also improve the P flag of mod_rewrite.
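For example, a rule like the following (backend:8443 being a placeholder) also goes through the default reverse proxy worker unless a matching worker is declared, so no per-worker key=value settings currently apply to it:

```apache
# Proxying via mod_rewrite's P flag uses the default reverse
# proxy worker when no explicitly defined worker matches the
# target URL:
RewriteEngine On
RewriteRule "^/api/(.*)$" "https://backend:8443/api/$1" [P,L]
```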

Thank you again,

Best regards,

Ben
Comment 3 Rainer Jung 2016-04-15 09:58:22 UTC
(In reply to Ben RUBSON from comment #2)

> Unfortunately as you said there is no possibility to enforce default values
> for the default worker.
> Once again some values would certainly not make sense for the default worker
> (connection pooling...), but some of them, such as for example
> connectiontimeout, timeout... would be worth it.

Yes, and I think I have to fully verify this (I checked the code, but only by reading it). If it is true, then being able to either define settings for the default worker or to set default settings for all workers including the default one would be a useful enhancement, probably both. Your use case of handling many backends and deciding between them based on policy/convention/table is happening more and more out there.

> I could think about the following : each time a new host/port pair arrives
> on the server, automatically create its small configuration file containing
> only the following macro line :
> Use MyWorker hostX:portX
> And reload the Apache configuration.
> Macro would be defined in the main configuration.
> The main configuration would also contain something like :
> Include /path/to/workers/configurations/worker.*.conf

I know installations that do it exactly like that.

> That's another story but, declaring a worker for each host:port would
> activate connection pooling for each one of them.
> I think connection pooling would be worth it in my use case: each time a
> user connects to the service, they are forwarded to the same reverse proxy.
> So connection pooling could be a good thing to have.

Yes, if there is reuse, i.e. the next request for the same backend often arrives before the connection runs into its HTTP keep-alive timeout (especially the one set by the backend!).

> However, can connection pooling have an impact in terms of performance if
> Apache has to manage many workers ?

I expect no problem with, say, fewer than 100 workers. I don't remember reports for any numbers, but if you have many more workers, there may not be much relevant experience out there, and some testing would be recommended.

> Could we tend towards something like a denial of service if the maximum
> number of connections in the pools is reached, and no other connections to
> other workers can be made?
> Or is there no limit at all here?

Pools are always local to an Apache process. Processes do not share connections to backends between themselves. The limit (maximum number of connections per backend) equals the number of threads in the process. Since Apache can't handle more requests in one process at one point in time than it has threads, and we allow the same number of backend connections, you should never run into a connection limitation unless you reduce the limit by configuring a smaller one. Pooling here is more about reuse than limiting.
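For illustration (myhost:myport and the parameter values are placeholders), such a smaller per-process limit could be set on an explicitly defined worker:

```apache
<Proxy "https://myhost:myport">
  # max caps the number of pooled backend connections per
  # process; ttl closes pooled connections that have been
  # idle for longer than the given number of seconds.
  ProxySet max=10 ttl=60
</Proxy>
```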

If you would use it to limit access to the backend via a threshold, the limit would be per process and you can't control which request ends up in which process. So limiting access via the connection pool would not be very precise. There's a "busy" counter which is shared between the Apache processes and knows how many requests are in flight on the backends (more precisely, those that come from this Apache), and one could limit using the busy counter to protect the backend, but I think (not 100% sure) we don't yet support limiting based on the busy value.

> Being able to define default worker (or clones of worker) parameters would
> also improve the P flag of mod_rewrite.

Yup. Maybe we should add another Bugzilla entry as a feature request for adding a way to configure the default workers (forward, reverse) and to configure defaults for all workers. Both should be per VHost, and values would be inherited from the global server to the VHosts.
Comment 4 Ben RUBSON 2016-04-15 12:45:51 UTC
(In reply to Rainer Jung from comment #3)
> 
> (...) being able to either define settings for
> the default worker or being able to set default settings for all workers
> including the default would be a useful enhancement, probably both.

> Maybe we should add another Bugzilla entry as a feature request for adding
> a way to configure the default workers (forward, reverse) and to configure
> defaults for all workers. Both should be per VHost and values would be
> inherited from global server to VHosts.

This would definitely work for parameters such as connectiontimeout.
But do you think it would work for connection pooling, without having to individually declare each target worker (as we need to for the moment)?
I.e. would Apache be "smart" enough to start a worker and its connection pool for every new proxy it deals with, and, if the worker has not been literally declared in the configuration, to apply the default configuration (which could then enable connection pooling by default, or at least contain connection pooling settings)?

> Pools are always local in Apache processes. Processes do not share
> connections to backends between themselves. The limit (maximum number of
> connections per backend) is equals to the number of threads in the process.
> Since Apache can't handle more requests in one process at one point in time
> than it has threads, and we allow the same number of backend connections,
> you should never run into a connection limitation unless you reduce the
> limit by configuring a smaller limit. Pooling here is more about reuse than
> limiting.

So if for any reason the prefork MPM is used, the connection pool only has one connection :)
Understood, perfectly clear, many thanks Rainer !
Comment 5 Ben RUBSON 2016-04-20 08:36:40 UTC
In addition, Rainer, should we definitely open a new Bugzilla entry as a feature request?
Thank you again,
Ben
Comment 6 Ben RUBSON 2016-04-25 09:18:55 UTC
Enhancement submitted here :
https://bz.apache.org/bugzilla/show_bug.cgi?id=59373

Thank you !

*** This bug has been marked as a duplicate of bug 59373 ***