Bug 51814

Summary: mod_proxy in Apache HTTP 2.2 leaves backend connections in CLOSE_WAIT for a long time on the mod_proxy side while the server side is in FIN_WAIT2.
Product: Apache httpd-2
Reporter: whatcher <wendell_hatcher>
Component: mod_proxy
Assignee: Apache HTTPD Bugs Mailing List <bugs>
Status: RESOLVED LATER
Severity: normal
CC: hendrik.harms, me, sebastien.allamand
Priority: P2
Keywords: MassUpdate
Version: 2.2.20
Target Milestone: ---
Hardware: Other
OS: Linux
Bug Depends on: 50807
Bug Blocks:

Description whatcher 2011-09-14 20:49:53 UTC
When mod_proxy is used, the backend connection stays in CLOSE_WAIT on the mod_proxy side for a long time while the backend server side is in FIN_WAIT2.

The impact of this behavior may be small, but we think it is caused by a less-than-ideal implementation in Apache httpd mod_proxy.
Specifically, the backend server sends a FIN to mod_proxy when its KeepAliveTimeout expires. When mod_proxy receives it, we want it to reply with FIN and ACK and release the port that is in use.
 
This matters for the following reasons:
* Holding unnecessary resources for a long time might cause unexpected bugs in the future.
* If the CLOSE_WAIT state persists for a long time and there is a firewall between mod_proxy_http and the backend, the firewall has to keep an unnecessary session, which might affect other communication sessions.
* If the FIN and ACK are sent only after a long gap, the firewall may already have dropped the session and might log warning messages.
  
Also, with mod_proxy in Apache 2.0, if the client doesn't use KeepAlive, the connection between mod_proxy and the backend server is closed, and we confirmed that no CLOSE_WAIT remains. In short, Apache 2.2 behaves worse than Apache 2.0 here.
Comparing the Apache 2.2 and 2.0 sources: in Apache 2.0 mod_proxy, as an extension of the teardown of the client-side TCP session, the TCP session to the backend server is closed with the apr_socket_close() function. In Apache 2.2 mod_proxy_http, this was changed to call connection_cleanup() or socket_cleanup(), and we think the cause is that apr_socket_close() is not called in those functions. In short, 2.2 doesn't close the session that should be closed immediately when KeepAlive is not in effect. We assume this is a simple bug where the close was forgotten during the mod_proxy refactoring.




(1) The client uses HTTP/1.0 without KeepAlive, so every client connection ends with a FIN.

(2) mod_proxy_http doesn’t disconnect after receiving the backend server's response to its request.

(3) The backend server sends a FIN at 30.007 sec because of KeepAliveTimeout. mod_proxy_http doesn’t return a FIN for this.

Port 47875 of mod_proxy_http becomes CLOSE_WAIT after this.   
(4) A new request reaches mod_proxy_http at 41.547 sec, and a separate new TCP session is created at 41.546 sec. The client side again ends with a FIN and is disconnected, but the backend server side is NOT disconnected. The backend server then disconnects at 71.545 sec, but mod_proxy_http doesn’t return a FIN. After this, port 47486 of mod_proxy_http becomes CLOSE_WAIT.

(5) The client sends a new request to mod_proxy_http at 77.157 sec. At 77.159 sec, mod_proxy_http sends FIN and ACK on the connection from (3) above, port 47485, and only then does the session from (3) end. It took 47 seconds to get here, which is a large gap compared with the KeepAliveTimeout set on the backend server.
 
We repeated the test a few times and found the following:
a) mod_proxy_http uses KeepAlive to the backend even though the client doesn't use KeepAlive.

b) Even when the backend sends a FIN because of KeepAliveTimeout, mod_proxy_http doesn’t respond, and the connection becomes CLOSE_WAIT.

c) mod_proxy_http closes the CLOSE_WAIT connection only when a new request is received.

d) However, if no new request comes, it never sends a FIN on the old connection, which stays in CLOSE_WAIT forever.
 
We consider b) and d) to be bad behavior for TCP/IP connections. The connection to the client is already closed, so this does not depend on the client’s KeepAlive behavior.
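
The half-closed state described in b) and d) can be reproduced outside httpd. The following is a minimal Python sketch, not mod_proxy code: a loopback "backend" closes its end, as a KeepAliveTimeout would, and the "proxy" socket then stays half-closed (CLOSE_WAIT on Linux) until the proxy side closes it. The helper name peer_has_closed is made up for this illustration.

```python
import select
import socket


def peer_has_closed(sock):
    """Return True if the peer has sent FIN and no data is pending.

    Hypothetical helper: peeks at the socket without consuming data.
    A readable socket whose peek returns b"" has reached end-of-stream.
    """
    readable, _, _ = select.select([sock], [], [], 0)
    if not readable:
        return False  # nothing arrived, connection still looks alive
    return sock.recv(1, socket.MSG_PEEK) == b""


def demo():
    """Open a loopback connection, close the backend end, and report
    what the proxy end sees before and after that close."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # any free port
    listener.listen(1)

    proxy_side = socket.create_connection(listener.getsockname())
    backend_side, _ = listener.accept()

    before = peer_has_closed(proxy_side)

    backend_side.close()  # the backend sends its FIN here
    # Wait (up to 5 s) until the FIN is visible on the proxy end.
    select.select([proxy_side], [], [], 5.0)
    after = peer_has_closed(proxy_side)

    # Only this close() sends the proxy's own FIN and ends CLOSE_WAIT.
    proxy_side.close()
    listener.close()
    return before, after
```

Between backend_side.close() and proxy_side.close(), the proxy end is in the state this report is about; nothing ends it except the proxy closing its own socket.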
Comment 1 sebmoule 2012-10-17 09:56:51 UTC
Hello, is there any news about this bug?
Comment 2 Eric Covener 2012-10-17 11:43:21 UTC
NEEDINFO = info needed from originator
Comment 3 Eric Covener 2012-10-17 12:03:16 UTC
I'm reclassifying this as an enhancement -- check for closed idle connections in the pool more frequently or asynchronously. The fact that they may sit in a closed state before being noticed is not itself a defect IMO.
Comment 4 Niklas Keller 2018-03-28 17:38:27 UTC
This is very much a bug IMO. If there's no TTL set on the ProxyPass directive and keep-alive is enabled on the origin server, there will be lots and lots of connections in CLOSE_WAIT. I guess if there are multiple backends this could easily exhaust the pool of ephemeral ports? Apache should set the TTL automatically to the timeout of the keep-alive header if it exists IMO.

See https://i.imgur.com/5pvBYwe.png for what happens if the origin enables keep-alive connections.
Comment 5 Eric Covener 2018-03-28 18:18:29 UTC
> Apache
> should set the TTL automatically to the timeout of the keep-alive header if
> it exists IMO.

I don't think this is feasible. The ttl is unfortunately a property of the pool at creation time (child process startup), not of each connection object in the pool, so it can't be updated after the pool has seen some transactions.

For the original issue, no TTL only:

There is a bias towards reusing the most recently returned connections, 
because the resource list API the connection pool is implemented with
is stack-like.  On a busy server, the deep end of the pool will
never be looked at.

Exacerbating this -- when mod_proxy does get unlucky and finds a dead
conn, instead of cycling through a bunch of items in the list
it just creates a new TCP connection for the same pooled object.
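
The stack-like bias can be illustrated with a toy model. This is a sketch in Python under simplifying assumptions, not the actual resource list implementation: the pool is a plain LIFO list, a connection is only inspected when it is popped, and all class and function names are invented for the example.

```python
class ToyConn:
    """Stand-in for a pooled backend connection (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.last_checked = None  # round in which the pool last looked at it


class LifoPool:
    """Toy LIFO pool: acquire pops the most recently released connection."""

    def __init__(self, conns):
        self._stack = list(conns)

    def acquire(self, now):
        conn = self._stack.pop()
        conn.last_checked = now  # only here would a dead socket be noticed
        return conn

    def release(self, conn):
        self._stack.append(conn)


def simulate(pool_size=5, rounds=100, concurrency=1):
    """Run `rounds` bursts of `concurrency` simultaneous requests and
    return the names of connections that were never looked at."""
    conns = [ToyConn(f"conn{i}") for i in range(pool_size)]
    pool = LifoPool(conns)
    for now in range(rounds):
        in_use = [pool.acquire(now) for _ in range(concurrency)]
        for conn in in_use:
            pool.release(conn)
    return [c.name for c in conns if c.last_checked is None]
```

In this model, simulate(5, 100, 1) leaves conn0 through conn3 untouched no matter how many rounds run: only as many entries as the peak concurrency are ever revisited, so anything deeper in the stack can sit closed by the backend without being noticed.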
Comment 6 Niklas Keller 2018-03-28 18:54:18 UTC
Adding 'min=0 max=20 smax=5 ttl=15' to 'ProxyPass' didn't help. The connections in CLOSE_WAIT still grew and grew, see https://i.imgur.com/0otdDon.png
Comment 7 Yann Ylavic 2018-03-28 21:35:41 UTC
(In reply to Niklas Keller from comment #6)
> Adding 'min=0 max=20 smax=5 ttl=15' to 'ProxyPass' didn't help. The
> connections in CLOSE_WAIT still grew and grew

Is this ttl below the backend's KeepAliveTimeout (e.g. 1 second below)?

If so the number of CLOSE_WAIT sockets shouldn't grow, and even be near zero whenever the traffic increases (with MaxSpareThreads doing its job).
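
Why the ttl needs to sit below the backend's KeepAliveTimeout can be shown with a small sketch. This is a toy model in Python, not mod_proxy code, and it assumes two things: the backend closes a connection once it has been idle for its keep-alive timeout, and the proxy discards any connection it looks at that has been idle for ttl seconds or more. The function and parameter names are hypothetical.

```python
def classify_idle_connection(idle_s, backend_keepalive_s, ttl_s=None):
    """Classify a pooled connection at the moment the proxy looks at it.

    Toy model (an assumption, not mod_proxy code): the backend closes
    after backend_keepalive_s of idleness, and the proxy discards any
    connection that has been idle for ttl_s or longer.
    """
    if ttl_s is not None and idle_s >= ttl_s:
        return "discarded by ttl"
    if idle_s >= backend_keepalive_s:
        return "stale: backend already sent FIN"
    return "reusable"
```

With a backend timeout of 30 seconds, classify_idle_connection(45, 30, ttl_s=60) is still "stale", because the ttl leaves a window between 30 and 60 seconds in which a connection passes the ttl check although the backend has already closed it. With ttl_s=29 that window is empty: every connection the ttl lets through is one the backend has not yet timed out.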
Comment 8 Hendrik Harms 2018-03-29 08:40:16 UTC
The count of CLOSE_WAIT sockets also depends on the chosen MPM, because the socket pool for backend connections belongs to the process. mpm_prefork handles only one connection per process, so all the other backend connections in the pool may time out and end up in CLOSE_WAIT. They persist until the same backend is addressed again or the process is killed. mpm_prefork will therefore cause a high number of CLOSE_WAIT sockets if you have many different ProxyPass directives in your config and high settings for MaxSpareServers and MaxRequestWorkers/ServerLimit.

e.g.: I've placed a comparison in Bug 50807 between old apache-1.3 and apache-2.4 (mpm_prefork)

mpm_worker should have less CLOSE_WAITs cause the chance that a pooled connection will be reused before timed out is much higher. Each thread of one worker process should have access to the whole backend connection pool of its worker process.
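
As a rough back-of-the-envelope check of the prefork case, the worst case can be estimated in a few lines of Python. This is only a sketch: it assumes each process keeps at most one pooled connection per backend and that all of them have timed out on the backend side. The function name and the example figures are hypothetical.

```python
def prefork_close_wait_upper_bound(processes, backends):
    """Rough upper bound on CLOSE_WAIT sockets under mpm_prefork.

    Assumes every process holds at most one pooled connection per
    backend and that the backend has already closed all of them.
    """
    if processes < 0 or backends < 0:
        raise ValueError("counts must be non-negative")
    return processes * backends
```

For example, 50 lingering processes and 20 distinct backends give prefork_close_wait_upper_bound(50, 20) == 1000, which is why lowering the number of spare processes shrinks the total directly.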
Comment 9 Yann Ylavic 2018-03-29 09:21:39 UTC
(In reply to Hendrik Harms from comment #8)
> 
> e.g.: I've placed a comparison in Bug 50807 between old apache-1.3 and
> apache-2.4 (mpm_prefork)
I added a comment there as to the "why".

> 
> mpm_worker should have less CLOSE_WAITs cause the chance that a pooled
> connection will be reused before timed out is much higher. Each thread of
> one worker process should have access to the whole backend connection pool
> of its worker process.
I think that MaxSpareThreads (event/worker) or MaxSpareServers (prefork) is the key to lower the number of "unused" threads/processes (hence potential CLOSE_WAITs), *once* the ttl is fine tuned.
Comment 10 William A. Rowe Jr. 2018-11-07 21:09:17 UTC
Please help us to refine our list of open and current defects; this is a mass update of old and inactive Bugzilla reports which reflect user error, already resolved defects, and still-existing defects in httpd.

As repeatedly announced, the Apache HTTP Server Project has discontinued all development and patch review of the 2.2.x series of releases. The final release 2.2.34 was published in July 2017, and no further evaluation of bug reports or security risks will be considered or published for 2.2.x releases. All reports older than 2.4.x have been updated to status RESOLVED/LATER; no further action is expected unless the report still applies to a current version of httpd.

If your report represented a question or confusion about how to use an httpd feature, an unexpected server behavior, problems building or installing httpd, or working with an external component (a third party module, browser etc.) we ask you to start by bringing your question to the User Support and Discussion mailing list, see [https://httpd.apache.org/lists.html#http-users] for details. Include a link to this Bugzilla report for completeness with your question.

If your report was clearly a defect in httpd or a feature request, we ask that you retest using a modern httpd release (2.4.33 or later) released in the past year. If it can be reproduced, please reopen this bug and change the Version field above to the httpd version you have reconfirmed with.

Your help in identifying defects or enhancements still applicable to the current httpd server software release is greatly appreciated.