Bug 48037 - mod_proxy_http does not handle asynchronous keepalive close events correctly
Alias: None
Product: Apache httpd-2
Classification: Unclassified
Component: mod_proxy_http
Version: 2.0.63
Hardware: PC Linux
Importance: P2 normal with 2 votes
Target Milestone: ---
Assignee: Apache HTTPD Bugs Mailing List
Keywords: MassUpdate
Depends on:
Reported: 2009-10-21 16:35 UTC by John Lightsey
Modified: 2018-11-07 21:08 UTC
CC List: 4 users

server config files (3.82 KB, application/zip)
2016-01-11 11:30 UTC, Helen
Server Config files (3.82 KB, application/zip)
2016-01-11 11:32 UTC, Helen

Description John Lightsey 2009-10-21 16:35:38 UTC
Noticed this with Apache 2.0.63, but it looks fairly clear that the same problem exists in 2.2.14.

The way proxy_http.c is written, if a keepalive connection is closed by the other side in ap_proxy_http_handler() after ap_proxy_http_create_connection() and before a response is received, Apache will consider this to be a connection error and generate a 502 response.

This logic is only valid for the first request sent over a keepalive connection.  According to the HTTP/1.1 spec, proxy_http.c must be capable of handling asynchronous close events in the middle of a keepalive session.  When it encounters an unexpected close, it should attempt to reestablish the connection and resend the request before generating a 502 response.
Comment 1 Ruediger Pluem 2012-08-10 06:39:44 UTC
Please check a recent 2.2.x / 2.4.x. This should have the problem fixed.
Comment 2 William Lovaton 2013-04-13 18:16:43 UTC
Sorry, but this problem is still happening. I'm testing httpd 2.2.23 from the http://centos.alt.ru/repository/centos/6/x86_64/ repository on a RHEL 6.4 server.  I was seeing the same problem with the official RHEL packages (httpd-2.2.15-26.el6.x86_64.rpm), and updating to 2.2.23 didn't solve it.

My configuration is as follows:
* Reverse Proxy: RHEL 6.4 with apache 2.2.23 from CentosALT (Worker MPM)
* Backend Servers: 2 x RHEL 6.2 with apache 2.2.15-15.el6.x86_64 from RHEL (Prefork MPM, huge in-house PHP 5 web app)

I'm using mod_proxy_balancer to spread the load between the 2 servers with the following configuration:

ProxyPreserveHost On

<Proxy balancer://ciklos-balancer>
   BalancerMember route=web1 redirect=web2 loadfactor=1 retry=0
   BalancerMember route=web2 redirect=web1 loadfactor=1 retry=0

   ProxySet stickysession=ROUTEID
   ProxySet nofailover=On
</Proxy>

ProxyPass /balancer-manager !
ProxyPass /server-status !
ProxyPass / balancer://ciklos-balancer/
ProxyPassReverse / balancer://ciklos-balancer/

I'm also using mod_disk_cache to cache static data in the Reverse Proxy, and the backend servers perform on-the-fly compression with mod_deflate.

From the Reverse Proxy's point of view, the failing request is always the first one on the client connection (keepalive count 0 according to the %k LogFormat directive), and the response code is HTTP 502.  These requests never reach the backend server: inspecting both logs, I can see the HTTP 502 response in the RP log, but there is no corresponding request in the backend log.

The error log shows the following 2 entries:
[Sat Apr 13 12:27:41 2013] [error] [client] (20014)Internal error: proxy: error reading status line from remote server, referer: http://www.ciklos.com.co/ciklos/php/vista/odontologia/hcOdontologica.php
[Sat Apr 13 12:27:41 2013] [error] [client] proxy: Error reading from remote server returned by /ciklos/ips.php, referer: http://www.ciklos.com.co/ciklos/php/vista/odontologia/hcOdontologica.php

The problem can be worked around by setting "force-proxy-request-1.0" and "proxy-nokeepalive", but then I lose the performance benefit of Keep Alive connections.  According to the logs I can get up to 190 keep-alive requests on the same connection from the Reverse Proxy to the backend server (not bad for a low traffic day like Saturdays).

There is also the option of using "proxy-initial-not-pooled" but again there is a performance penalty if you use it.
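
For reference, a minimal sketch of how these two workarounds might look in the Reverse Proxy configuration. The environment variable names are the ones mentioned above; setting them at server or virtual-host level with SetEnv is an assumption, and normally only one of the two options would be enabled at a time.

```apache
# Option 1: disable backend keepalive entirely. Every proxied request
# opens a fresh backend connection, so a stale pooled connection is
# never reused (at the cost of one connection setup per request).
SetEnv force-proxy-request-1.0 1
SetEnv proxy-nokeepalive 1

# Option 2 (alternative to option 1): do not reuse a pooled backend
# connection when the request is the first one on the client connection.
# SetEnv proxy-initial-not-pooled 1
```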

The problem doesn't happen very often. On a low-traffic day like today I see 432 HTTP 502 responses in a sample of 1,300,000 requests.  But it still worries me, since it can annoy my users.

We are trying to replace an old NetScaler appliance which seems to handle Keep Alive connections just fine.

Isn't there a way for mod_proxy_http to retry the request when this error happens? Or maybe some other solution?

Thanks a lot for your time.
Comment 3 Simon Oberhammer 2013-07-31 14:53:30 UTC
We too are seeing this error on 2.2.23.

Our ProxyPass setup is very simple:

    ProxyPass  /something  http://localhost:8080/somethingelse/

We have a high traffic site and get several of those errors every day:

(20014)Internal error: proxy: error reading status line from remote server localhost:8080
proxy: Error reading from remote server returned by /proxy/foo/bar

The proxy backend is Jetty 6, which has a connection timeout of 200 seconds. We added "ttl=100" hoping this would mitigate the problem, but I see no change.

"proxy-initial-not-pooled" is not a solution for us, since we have KeepAlive disabled in Apache (as I understand it, setting this env would thus disable the pooling completely?)
Comment 4 jkaluza 2013-10-01 09:57:46 UTC
It would be interesting to retest it with httpd-2.4. There are some changes in mod_proxy_http which could fix this issue.
Comment 5 Helen 2016-01-11 11:30:34 UTC
Created attachment 33423
server config files
Comment 6 Helen 2016-01-11 11:32:10 UTC
Created attachment 33424
Server Config files
Comment 7 Helen 2016-01-11 11:33:31 UTC
We are currently load testing an Apache caching proxy server as part of an AWS system.

Apache version : Apache/2.4.16 (Amazon) (Server built: Aug 13 2015 23:52:13)

During testing we would get the following errors intermittently:

[Sat Jan 09 15:15:00.220878 2016] [proxy_http:error] [pid 2538:tid (20014)Internal error: [client] AH01102: error reading status line from remote server xxxxxxxxxxx.s3-website-us-east-1.amazonaws.com:80

These errors result in 504 errors from the Amazon ELBs.

Setting the following fixed the problem:
 SetEnv force-proxy-request-1.0 1
 SetEnv proxy-nokeepalive 1

If I am not mistaken, this is the same problem as noted previously, so it is still present in 2.4.

The config files are attached.
Comment 8 Yann Ylavic 2016-01-11 12:13:54 UTC
Does it happen if you configure a TTL for the proxy connections?

For example: ProxyPass / http://my.example.com/" ttl=<TTL>
where <TTL> is lower (say 1 second) than the KeepAliveTimeout configured on the backend server.
Comment 9 Yann Ylavic 2016-01-11 12:19:57 UTC
(In reply to Yann Ylavic from comment #8)
> For example: ProxyPass / http://my.example.com/" ttl=<TTL>
Please ignore this spurious double-quote         ^
Comment 10 Helen 2016-01-11 12:34:29 UTC
(In reply to Yann Ylavic from comment #9)
> (In reply to Yann Ylavic from comment #8)
> > For example: ProxyPass / http://my.example.com/" ttl=<TTL>
> Please ignore this spurious double-quote         ^

As the backend server is AWS S3, there is no configuration for it, so I don't know what its keepalive timeout is. I could guess that it is set to something like 60 seconds, but I have no idea.
Comment 11 Yann Ylavic 2016-01-11 12:51:37 UTC
Well, I guess in that case a guess is needed.
There is nothing mod_proxy can do to avoid a race condition when the backend closes the connection while the next request is being sent on that same connection (nor can it send the same request twice, because requests are often non-idempotent).
Thus this must be addressed by configuring the correct TTL at the proxy, that is, a value lower than the keepalive timeout used at the backend, so that the race condition never happens.
Since KeepAliveTimeout is an idle timeout (in between requests), it is generally lower than the 60s you mention, so I'd suggest something like 4s to start with (for the proxy's TTL), and see if that helps...
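
A sketch of that suggestion, assuming for illustration that the backend is an httpd instance with KeepAliveTimeout set to 5 seconds (the httpd default). The hostname is the placeholder from comment 8, and the two directives belong to two different servers' configurations. For a backend whose idle timeout is unknown, the TTL value has to be guessed and tuned.

```apache
# Backend configuration (assumed to be httpd): idle keepalive
# connections are closed after 5 seconds.
KeepAliveTimeout 5

# Proxy configuration: a pooled backend connection that has been idle
# for more than 4 seconds is not reused, so the proxy stops using it
# before the backend's idle timeout can close it.
ProxyPass / http://my.example.com/ ttl=4
```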
Comment 12 Helen 2016-01-11 13:02:20 UTC
OK, I will try that in a bit. I'm in the middle of an hour-long load run at the moment.
Comment 13 Helen 2016-01-11 18:36:45 UTC
Ran with ttl=5 and ttl=1; both generated errors.
Only setting:

SetEnv force-proxy-request-1.0 1
SetEnv proxy-nokeepalive 1

stops the errors.
Comment 14 Jim Jagielski 2016-03-07 21:25:37 UTC
You could also set up IgnoreErrors, which should automatically retry the current connection.
Comment 15 vin01 2016-08-01 03:04:50 UTC
I am getting intermittent failures with errors like:

[Sun Jul 31 10:53:17.579566 2016] [proxy_http:error] [pid 42678] (103)Software caused connection abort: [client <ip>:<port>] AH01102: error reading status line from remote server <ip>:<port>, referer: <url>

Other than this, I also keep getting errors like:

[Sun Jul 31 21:06:11.629845 2016] [proxy_http:error] [pid 19153] (104)Connection reset by peer: [client <client_ip>:<port>] AH01095: prefetch request body failed to <backend_ip>:<port> (<backend_ip>) from <client_ip> (), referer: <url>

It happens for roughly 1 request in 1,000.

I am running httpd 2.4.18 on CentOS 6.8, and the backend is Jetty 9.3, which supports HTTP/1.0 and HTTP/1.1.

I added "SetEnv proxy-nokeepalive 1" to disable keepalives, but I am still getting these errors.

Should I also add the following?

"SetEnv force-proxy-request-1.0 1"

May I know why it is suggested to force HTTP/1.0 when the issue is only with keepalives?

Any other suggestions to avoid these errors?

Comment 16 William A. Rowe Jr. 2018-11-07 21:08:48 UTC
Please help us to refine our list of open and current defects; this is a mass update of old and inactive Bugzilla reports which reflect user error, already resolved defects, and still-existing defects in httpd.

As repeatedly announced, the Apache HTTP Server Project has discontinued all development and patch review of the 2.2.x series of releases. The final release 2.2.34 was published in July 2017, and no further evaluation of bug reports or security risks will be considered or published for 2.2.x releases. All reports older than 2.4.x have been updated to status RESOLVED/LATER; no further action is expected unless the report still applies to a current version of httpd.

If your report represented a question or confusion about how to use an httpd feature, an unexpected server behavior, problems building or installing httpd, or working with an external component (a third party module, browser etc.) we ask you to start by bringing your question to the User Support and Discussion mailing list, see [https://httpd.apache.org/lists.html#http-users] for details. Include a link to this Bugzilla report for completeness with your question.

If your report was clearly a defect in httpd or a feature request, we ask that you retest using a modern httpd release (2.4.33 or later) released in the past year. If it can be reproduced, please reopen this bug and change the Version field above to the httpd version you have reconfirmed with.

Your help in identifying defects or enhancements still applicable to the current httpd server software release is greatly appreciated.