Bug 35292

Summary: ap_lingering_close does not linger up to MAX_SECS_TO_LINGER
Product: Apache httpd-2 Reporter: Gonzalo Paniagua Javier <gonzalo>
Component: CoreAssignee: Apache HTTPD Bugs Mailing List <bugs>
Status: RESOLVED FIXED    
Severity: trivial CC: garethwebbley
Priority: P2 Keywords: PatchAvailable
Version: 2.0-HEAD   
Target Milestone: ---   
Hardware: Other   
OS: All   
Attachments: Patch that fixes the problem.
Second attempt. Patch against http-2.0.x

Description Gonzalo Paniagua Javier 2005-06-09 17:40:24 UTC
ap_lingering_close is supposed to read any leftover data in the connection until
there's no more data or until MAX_SEC_TO_LINGER (30) seconds elapsed.

But turns out that it only performs up to 15 reads of (at most) 512 bytes.
Comment 1 Gonzalo Paniagua Javier 2005-06-09 17:42:28 UTC
Created attachment 15353 [details]
Patch that fixes the problem.

This patch fixes the issue and 2 warnings I got with -Wall on apr_shutdown and
apr_recv not being declared.
Comment 2 Paul Querna 2005-06-09 18:20:46 UTC
apr_shutdown should be part of apr_compat.h.  The submitted patch is against the
2.0.x branch, not trunk.
Comment 3 Joe Orton 2005-06-10 16:27:10 UTC
This was mentioned on http-wg a while ago, indeed.

Does the current behaviour actually cause practical problems?  The change
introduces an extra gettimeofday() call into the normal processing of every
request so it needs good justification.
Comment 4 Gonzalo Paniagua Javier 2005-06-10 18:06:54 UTC
I found this out because I was sending a POST of a few MB to a web service, but
I got the url wrong and saw that there were 15 small reads when closing before i
got the RST when trying to read from the socket.
Comment 5 Joe Orton 2005-06-10 19:52:06 UTC
OK, there are a few alternatives here:

- take the hit on the uncommon path where the read() doesn't get EOF first time
through, and keep calling read()/apr_time_now() until 30 seconds have *really*
passed

pro: equally as safe as 1.3 code
con: more overhead

- bump the tmp buffer size to 8K and lower the read() timeout to 1 second.  this
way the server could eat 30*8K=245K bytes without additional overhead, giving a
significantly better chance of getting the response to the client than just
15*512=~7K

pro: little more overhead than current 2.0 code
con: still less safe than 1.3
Comment 6 Gonzalo Paniagua Javier 2005-06-10 20:32:23 UTC
If there's anything I can say, I'd go for the first alternative and linger for
at most ~32s, while not incurring in any additional penalty for the common path.
Comment 7 Jeff Trawick 2005-06-11 19:38:30 UTC
The reason that such code isn't in there now is to avoid all the syscalls
(retrieving the time).  Some cleverness may allow an implementation that is
willing to wait longer (close to MAX_SECS_TO_LINGER) without retrieving the time
so much.
Comment 8 Gonzalo Paniagua Javier 2005-06-11 20:37:13 UTC
Created attachment 15375 [details]
Second attempt. Patch against http-2.0.x

How about this? There are no calls to time() involved. The first read polls for
up to 30s and subsequent ones only for 0.5s, with a limit on the maximum number
of reads that doubles the maximum read length existing now.
Comment 9 Gonzalo Paniagua Javier 2005-06-11 20:38:15 UTC
Oops. Forgot to increment nread_ops in the loop.
Comment 10 Joe Orton 2005-06-12 13:46:56 UTC
I don't see how that implementation actually fixes the problem.  Or am I missing
something fundamental?

The *problem* is that the server is not reading enough bytes from the socket to
eat up the TCP window and prevent an RST from allowing the client to see the
response.  Just changing the timeouts makes no difference, the solution needs to
actually increase the number of read() calls made (and/or increase the buffer
size passed to read).

In the situation which lingering close is helping, the first read call is *not*
going to time out; the TCP receive buffer for this socket on the server will
already be non-empty so it will necessarily return data immediately.

Changing that first timeout just introduces a new problem; if the client
disappears completely after reading the response, the server will now hang
around for thirty seconds waiting for a FIN that will never arrive, rather than
just for two seconds.
Comment 11 Joe Orton 2005-09-25 22:16:45 UTC
Fixed on trunk and back-ported for 2.1.8 and later.

http://svn.apache.org/viewcvs?rev=291452&view=rev
Comment 12 Joe Orton 2005-09-25 22:23:56 UTC
*** Bug 17722 has been marked as a duplicate of this bug. ***