Bug 63666 - Should take the OS buffers into account when timing lingering
Summary: Should take the OS buffers into account when timing lingering
Status: NEW
Alias: None
Product: Apache httpd-2
Classification: Unclassified
Component: Platform
Version: 2.4-HEAD
Hardware: PC Linux
Importance: P2 normal
Target Milestone: ---
Assignee: Apache HTTPD Bugs Mailing List
Depends on:
Reported: 2019-08-15 07:50 UTC by Sjoerd Simons
Modified: 2019-08-15 10:24 UTC

python test case (919 bytes, text/x-python)
2019-08-15 07:50 UTC, Sjoerd Simons

Description Sjoerd Simons 2019-08-15 07:50:57 UTC
Created attachment 36718
python test case

Note: the version tested is 2.4.41; however, the version field doesn't seem to offer that one.

For context: we're using bmaptool (https://github.com/intel/bmap-tools) to flash embedded boards over the network. bmaptool can download an image on the fly, uncompress it and write it to storage (e.g. an SD card). As the input image is compressed, the amount of work bmaptool needs to do fluctuates heavily (e.g. towards the end of an image the content will mostly be zeros, which means that a very small amount of compressed data transferred yields a large amount of uncompressed data).

What we saw in practice is that on some specific boards/images Apache ends up resetting the connection when the data transfer is nearly finished.

Tracing this down, what happens is that the connection ends up in FIN-WAIT-1 (in other words, Apache has already shut down its write side of the connection) with a fair amount of data left in the send queue, because the connection was stalled at that time. After 30 seconds the connection gets reset.

On the Apache side, what happens is that it simply finishes writing all its data to the socket, shuts down the write side, lingers for at most 30 seconds and then closes, which since https://svn.apache.org/viewvc?view=revision&revision=1802875 forces a connection reset (on older versions the connection would "linger"/be "orphaned" on the OS side).
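
For readers unfamiliar with the pattern, the sequence described above (half-close, wait a bounded time for the peer, then close) can be sketched roughly as follows. This is only an illustration of the general lingering-close technique in Python, not httpd's actual code; the function name `lingering_close` and the `max_linger` parameter are made up for this sketch. Note that it only watches the receive side, so it has no idea whether the kernel has finished sending the data that was already written.

```python
import select
import socket
import time


def lingering_close(sock, max_linger=30.0):
    """Half-close sock, discard input until EOF or the deadline, then close.

    Returns True if the peer closed its side first (EOF seen),
    False if the linger period ran out. Hypothetical helper for illustration.
    """
    try:
        # Signal "no more data from us"; the kernel queues a FIN behind
        # whatever is still waiting in the send queue.
        sock.shutdown(socket.SHUT_WR)
    except OSError:
        pass  # the peer may already be gone

    deadline = time.monotonic() + max_linger
    saw_eof = False
    try:
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            readable, _, _ = select.select([sock], [], [], remaining)
            if not readable:
                break  # linger period expired
            try:
                data = sock.recv(4096)
            except OSError:
                break
            if not data:
                saw_eof = True  # peer closed its write side
                break
            # Anything else the peer sends is read and thrown away.
    finally:
        sock.close()
    return saw_eof
```

With a stalled receiver, the loop above simply times out: the peer never sends EOF because it has not even seen all the data yet, let alone the FIN.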

On the network side, what happens is that the download is stalled (bmaptool is busy) because the receiver window is full, which means that even though Apache is already lingering, not all data has been transferred and the FIN hasn't been sent yet. This is then followed by an RST packet as Apache causes the connection to be dropped, with the receiver never having a chance to see all the data (or the FIN).

What should probably happen is that when Apache does its lingering, it should check the send queue size on the OS side before hard-terminating the connection (or leave it up to the OS, which is what happened previously), as the connection might simply have slowed down enough that the send queue cannot drain within 30 seconds.

I've attached a minimal Python test case that shows the issue. The key is to tweak the code and the setup a bit so that Apache is lingering with a good amount of data left in the send queue when the 40-second sleep happens.
Comment 1 Joe Orton 2019-08-15 09:25:59 UTC
Interesting problem.

Is there a portable way to determine the length of the TCP send queue?  Apparently the ioctl TIOCOUTQ might do it for (some?) Unix, though we've got no experience with using that in APR/httpd.
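
As a rough illustration of what such a query could look like, here is a minimal Python sketch using the TIOCOUTQ ioctl. It assumes a platform where `termios.TIOCOUTQ` is defined and where the ioctl is accepted on TCP sockets (Linux is the case I'm reasonably sure of); elsewhere it just returns `None`. The helper name `send_queue_bytes` is invented for this example, and none of this says anything about how APR would expose it.

```python
import fcntl
import socket
import struct
import termios


def send_queue_bytes(sock):
    """Return the number of bytes still held in the kernel send queue.

    Returns None where the query is not available. Assumption: the
    platform defines termios.TIOCOUTQ and supports it on sockets.
    """
    request = getattr(termios, "TIOCOUTQ", None)
    if request is None:
        return None
    try:
        # The kernel writes a C int into the buffer we pass in.
        raw = fcntl.ioctl(sock.fileno(), request, struct.pack("i", 0))
    except OSError:
        return None  # ioctl not supported on this kind of descriptor
    return struct.unpack("i", raw)[0]
```

Called on the server side of a stalled connection, this would be expected to report a non-zero value for as long as the receiver has not taken all the data.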

Even if we can determine that length, I'm not sure what the right logic would be here.  The existence of a non-zero send queue is not sufficient to delay the lingering close, since that's indistinguishable from a DoS, which this is supposed to protect against.  Maybe a *decreasing* send queue length would be sufficient, but possibly we'd need some heuristic on how fast it should decrease to keep the socket open.
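
To make the "decreasing send queue" idea concrete, one possible shape for such a heuristic is sketched below in Python. Everything here is hypothetical: the function name, the sampling scheme and the `min_drain_rate` threshold of 1024 bytes per second are placeholders rather than proposed values, and a real implementation would still want an overall cap on how long a connection may linger.

```python
def should_keep_lingering(samples, min_drain_rate=1024.0):
    """Decide whether a lingering socket deserves more time.

    samples is a list of (timestamp_seconds, queued_bytes) pairs, oldest
    first, e.g. gathered by polling the send queue length periodically.
    Returns True only if data is still queued AND the queue has been
    draining at min_drain_rate bytes/second or faster.
    """
    if len(samples) < 2:
        return False  # not enough history to judge progress
    (t_first, q_first), (t_last, q_last) = samples[0], samples[-1]
    if q_last == 0:
        return False  # queue is empty, nothing left to wait for
    elapsed = t_last - t_first
    if elapsed <= 0:
        return False
    drain_rate = (q_first - q_last) / elapsed
    return drain_rate >= min_drain_rate


# A slow but progressing peer is given more time; a stalled one is not.
slow_peer = [(0.0, 100_000), (2.0, 90_000)]      # draining at 5000 B/s
stalled_peer = [(0.0, 100_000), (2.0, 100_000)]  # no progress at all
print(should_keep_lingering(slow_peer), should_keep_lingering(stalled_peer))
```

A fully stalled peer is still cut off under this scheme, which is the DoS protection, while a peer that is merely slow keeps its connection.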
Comment 2 Sjoerd Simons 2019-08-15 10:24:42 UTC
I'm unsure how to get those statistics in a good way. I'm not fully aware of which DoS it's protecting against (I assume leaving orphaned connections open in FIN-WAIT-2?).

However, a DoS that triggers this by stalling the download so that the connection stays in FIN-WAIT-1 with data queued seems equivalent to an attacker stalling the connection at any other time (e.g. halfway through the download rather than at the end). I'm unsure whether Apache has protection against that, but if so, the protection for this corner case should probably be equivalent.