Bug 55475 - mod_proxy breaks HTTP chunked transfer coding
Summary: mod_proxy breaks HTTP chunked transfer coding
Status: RESOLVED FIXED
Alias: None
Product: Apache httpd-2
Classification: Unclassified
Component: mod_proxy (show other bugs)
Version: 2.5-HEAD
Hardware: All All
: P2 critical with 3 votes (vote)
Target Milestone: ---
Assignee: Apache HTTPD Bugs Mailing List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-08-23 18:35 UTC by Hannes Schmidt
Modified: 2014-08-22 12:07 UTC (History)
2 users (show)



Attachments
A sample backend used for the reproduction steps (2.52 KB, text/x-python-script)
2013-08-23 18:35 UTC, Hannes Schmidt
Details
Detect incomplete body in HTTP input filter and return APR_INCOMPLETE (2.2.x) (1.38 KB, patch)
2013-09-13 10:27 UTC, Yann Ylavic
Details | Diff
Detect incomplete body in HTTP input filter and return APR_INCOMPLETE (trunk) (2.30 KB, patch)
2013-09-13 10:32 UTC, Yann Ylavic
Details | Diff
Detect incomplete body in HTTP input filter and return APR_INCOMPLETE (trunk) (2.72 KB, patch)
2013-09-13 12:08 UTC, Yann Ylavic
Details | Diff
Naive attempt at detecting incomplete backend response in mod_proxy (1.77 KB, text/plain)
2013-09-14 01:32 UTC, Hannes Schmidt
Details
Detect incomplete body in HTTP input filter and return APR_INCOMPLETE (2.2.x) (2.75 KB, patch)
2013-09-14 11:16 UTC, Yann Ylavic
Details | Diff
Detect incomplete body in HTTP input filter and return APR_ECONNABORTED (2.2.x) (2.70 KB, patch)
2013-09-16 08:30 UTC, Yann Ylavic
Details | Diff
Detect incomplete body in HTTP input filter and return APR_ECONNABORTED (trunk) (3.45 KB, patch)
2013-09-16 12:37 UTC, Yann Ylavic
Details | Diff
Detect incomplete body in HTTP input filter and return APR_ECONNABORTED (trunk) (1.65 KB, patch)
2013-09-16 13:03 UTC, Yann Ylavic
Details | Diff
Handle 504 like 502 in http error/chunk output filters (trunk) (2.00 KB, patch)
2013-09-16 13:06 UTC, Yann Ylavic
Details | Diff
Detect incomplete body in HTTP input filter and return APR_ECONNABORTED (2.4.x) (3.00 KB, patch)
2013-09-16 13:09 UTC, Yann Ylavic
Details | Diff
Don't return success and an empty brigade in core_input_filter's getline (trunk) (2.08 KB, patch)
2013-09-17 11:43 UTC, Yann Ylavic
Details | Diff
Detect incomplete body in HTTP input filter and return APR_ECONNABORTED (trunk) (4.24 KB, patch)
2013-09-17 12:02 UTC, Yann Ylavic
Details | Diff
Detect incomplete body in HTTP input filter and return APR_ECONNABORTED (trunk) (3.81 KB, patch)
2013-09-18 08:06 UTC, Yann Ylavic
Details | Diff
Don't return success and an empty brigade in core_input_filter's getline/blocking mode (trunk) (1000 bytes, patch)
2013-09-18 08:27 UTC, Yann Ylavic
Details | Diff
Don't return success and an empty brigade in core_input_filter's getline (trunk) (1.00 KB, patch)
2013-09-18 08:32 UTC, Yann Ylavic
Details | Diff
Detect incomplete body in HTTP input filter and return APR_ECONNABORTED (trunk) (3.52 KB, patch)
2013-09-18 11:26 UTC, Yann Ylavic
Details | Diff
Detect incomplete body in HTTP input filter and return APR_ECONNABORTED (2.2.15+RHEL6 patches) (2.67 KB, patch)
2013-09-28 00:56 UTC, Hannes Schmidt
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Hannes Schmidt 2013-08-23 18:35:23 UTC
Created attachment 30757 [details]
A sample backend used for the reproduction steps

Hi,

Apache’s mod_proxy turns incomplete or invalid backend responses into complete ones if the backend uses the chunked transfer coding. This applies to Apache 2.2.x, 2.4.x and the current trunk (2.5.x). If the zero-length chunk that terminates a complete chunked-coding message is missing in the backend response, mod_proxy (or rather the core chunk_filter) adds one. Apache acts as if the backend server sent a complete response, whereas in fact it didn’t. This is very worrying because it prevents clients from reliably detecting errors in the backend.

If the backend doesn’t correctly implement chunked transfer coding, by, say using illegal characters in the size line that precedes every chunk, those violations are ignored by mod_proxy, at least in 2.2.25.

The upcoming RFC by the HTTPbis working group clarifies how clients -- and mod_proxy is a HTTP client -- must handle incomplete messages [1]:

> A client that receives an incomplete response message, which can occur when a 
> connection is closed prematurely or when decoding a supposedly chunked transfer 
> coding fails, must record the message as incomplete.

By turning an incomplete message into a complete one, mod_proxy will be in violation of this section when the draft becomes standard. But that section just describes commonly accepted wisdom: detect and report errors, don’t ignore them, and don’t pretend they didn’t happen.

The situation in 2.4.x and trunk is a little more complicated: there seems to be some non-determinism. One in ten mod_proxy responses do correctly omit the terminating zero-length chunk if the backend response didn’t end in one.

Steps to reproduce (assuming wget, bash, Python, pip and Apache already installed):

$ python -V # should have Python 2.7, not sure if 2.6 works
Python 2.7.5
$ which pip # should have pip, if not see http://www.pip-installer.org/en/latest/installing.html
/usr/local/bin/pip
$ pip install tornado # or use whatever means you prefer to install pip
$ wget 'https://gist.github.com/hannes-ucsc/a8ce89e3ce7967ffa833/raw/69244d94b1d1dd2f0304ecd428f047fe94eb017a/server.py'
$ python server.py 8881

Meanwhile, in another shell:

$ wget 'https://gist.github.com/hannes-ucsc/32df3a1adf6085bdb2cd/raw/13e55b9fc51e2f4486e9b8f4fde4e84f2bc9221a/httpd.conf'
$ sudo cat httpd.conf > /etc/apache2/httpd.conf # or equivalent
$ apachectl restart
$ curl 'http://localhost/cghub/metadata/analysisObject?fake_error=1' && echo success || echo failure

Expected: 

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<ResultSet date="2013-08-10 02:09:40" id="43430736">
...
	</Result>
curl: (18) transfer closed with outstanding read data remaining
failure

The backend sends an incomplete response, so curl should fail.

Actual:

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<ResultSet date="2013-08-10 02:09:40" id="43430736">
...
	</Result>
success

For Apache 2.4.x and trunk we need to use a loop

$ while ! curl 'http://localhost/cghub/metadata/analysisObject?fake_error=1' ; do echo retry ; done

Expected: The loop should never exit.
Actual: At some point the loop exits because curl unexpectedly succeeds.

If we hit the backend directly, we get the expected failure:

$ curl 'http://localhost:8881/cghub/metadata/analysisObject?fake_error=1' && echo success || echo failure
...
failure

[1] http://www.mnot.net/talks/bits-on-the-wire/httpbis.html#incomplete.messages
Comment 1 Hannes Schmidt 2013-09-12 20:40:37 UTC
Does nobody care that proxying of chunked coding is not RFC compliant?

What can we do to bring this to the attention of the right people?
Comment 2 Yann Ylavic 2013-09-13 10:27:46 UTC
Created attachment 30830 [details]
Detect incomplete body in HTTP input filter and return APR_INCOMPLETE (2.2.x)

The problem is that mod_proxy_http stops forwarding backend's data when it gets APR_EOF.

Since it does not know anything about the state of the body it forwards (http_input_filter's BODY_CHUNK/BODY_LENGTH/BODY_NONE), APR_EOF may be a legitimate end-of-body returned by the core_input_filter. Hence it can't do much better at its level.

IMHO, the fix has to be in the http_input_filter.

For an output filter of mine (2.2.x) taking care about the truncated response body, I use the attached patch which does the job , and is quite simple...
Comment 3 Yann Ylavic 2013-09-13 10:32:46 UTC
Created attachment 30831 [details]
Detect incomplete body in HTTP input filter and return APR_INCOMPLETE (trunk)

Same patch for trunk (untested).
Comment 4 Yann Ylavic 2013-09-13 11:02:57 UTC
> If the backend doesn’t correctly implement chunked transfer coding, by,
> say using illegal characters in the size line that precedes every chunk,
> those violations are ignored by mod_proxy, at least in 2.2.25.

This has been fixed in 2.4.x, but not in 2.2.x (and, I guess, is unlikely to be).
Comment 5 Yann Ylavic 2013-09-13 12:08:36 UTC
Created attachment 30832 [details]
Detect incomplete body in HTTP input filter and return APR_INCOMPLETE (trunk)

New patch for trunk, I forgot the EOS case with remaining data.
Comment 6 Hannes Schmidt 2013-09-14 01:32:07 UTC
Created attachment 30837 [details]
Naive attempt at detecting incomplete backend response in mod_proxy

Thank you for looking into this, Yann. I tried both of your patches against their respective branches. They don't fix the issue for me. 

The trunk patch doesn't change the fact that the behaviour is indeterministic. Interestingly, in the cases where I see the expected behaviour (no zero-length chunk sent before connection is closed), there is a slight delay of a few seconds before Apache closes the connection.

I set breakpoints into the new code paths added by your 2.2.x patch and neither of them is ever hit.

A few weeks ago I endeavored to create my own patch against 2.2.x and it does exhibit the desired behavior in our particular use case. However, I am pretty sure it breaks other use cases so I am attaching it for illustrative purposes only.
Comment 7 Yann Ylavic 2013-09-14 11:16:03 UTC
Created attachment 30839 [details]
Detect incomplete body in HTTP input filter and return APR_INCOMPLETE (2.2.x)

My fault, when the filter bails out on error it does not work (my output filter actually catch the error buckets, so it works for me...).

Is this new patch against 2.2.x any better ?
Comment 8 Hannes Schmidt 2013-09-15 06:49:47 UTC
Great! The new 2.2.x patch does fix the issue.

Just to be mean ;-), I did add a new test case, which is to close the connection in the middle of a chunk. In that case mod_proxy/http_filter turns the incomplete chunk into a complete on and then terminates it. For example,

        GET /cghub/metadata/analysisObject?fake_error=1 HTTP/1.1
        Host: 127.0.0.1:8881
        User-Agent: curl/7.24.0 (x86_64-apple-darwin12.0) libcurl/7.24.0 OpenSSL/0.9.8x zlib/1.2.5
        Accept: */*
        X-Forwarded-For: ::1
        X-Forwarded-Host: localhost:8080
        X-Forwarded-Server: 127.0.0.1
        Connection: Keep-Alive

        HTTP/1.1 200 OK
        Date: Sun, 15 Sep 2013 06:36:16 GMT
        Transfer-Encoding: chunked
        Content-Type: text/xml
        Server: TornadoServer/3.1

        1000
        foo

becomes 

        GET /cghub/metadata/analysisObject?fake_error=1 HTTP/1.1
        User-Agent: curl/7.24.0 (x86_64-apple-darwin12.0) libcurl/7.24.0 OpenSSL/0.9.8x zlib/1.2.5
        Host: localhost:8080
        Accept: */*

        HTTP/1.1 200 OK
        Date: Sun, 15 Sep 2013 06:36:16 GMT
        Server: TornadoServer/3.1
        Content-Type: text/xml
        Transfer-Encoding: chunked

        3
        foo
        0
Comment 9 Yann Ylavic 2013-09-16 08:30:03 UTC
Created attachment 30846 [details]
Detect incomplete body in HTTP input filter and return APR_ECONNABORTED (2.2.x)

A new patch that fix that case too.

The problem was the core_input_filter's EOS not stripped from the brigade when returning APR_ECONNABORTED (previously APR_INCOMPLETE, but the associated error message "Partial results are valid but processing is incomplete" looks weird).
Comment 10 Yann Ylavic 2013-09-16 12:37:00 UTC
Created attachment 30847 [details]
Detect incomplete body in HTTP input filter and return APR_ECONNABORTED (trunk)

Same fix for trunk/2.2.x, plus the HTTP_GATEWAY_TIME_OUT handling in ap_http_outerror_filter (ap_http_chunk_filter likewise) or the client connection won't be closed (when asked to keep alive).

Maybe the HTTP_GATEWAY_TIME_OUT handling in ap_http_outerror_filter and ap_http_chunk_filter should be in a separate patch since it is something broken by r1480058 (and still debated here: http://mail-archives.apache.org/mod_mbox/httpd-dev/201305.mbox/%3c518A0304.9050707@apache.org%3e)
Comment 11 Yann Ylavic 2013-09-16 13:03:04 UTC
Created attachment 30848 [details]
Detect incomplete body in HTTP input filter and return APR_ECONNABORTED (trunk)

The trunk patch split in 2 (second one coming next...).
Comment 12 Yann Ylavic 2013-09-16 13:06:31 UTC
Created attachment 30849 [details]
Handle 504 like 502 in http error/chunk output filters (trunk)

Fix 502=>504 side-effect from r1480058.
Comment 13 Yann Ylavic 2013-09-16 13:09:13 UTC
Created attachment 30850 [details]
Detect incomplete body in HTTP input filter and return APR_ECONNABORTED (2.4.x)

Finally the one for 2.4.x (same as 2.2.x, without conflicts).
Comment 14 Hannes Schmidt 2013-09-17 02:34:37 UTC
Brilliant work.

I tested the two error cases (1: incomplete chunk and 2: zero-length chunk missing) with the patches for 2.2.x, 2.4.x and trunk.

2.2.x and 2.4.x work great. Trunk still has a non-determinism if the zero-length chunk is missing at the end. About one in 10 requests unexpectedly succeeds.
Comment 15 Yann Ylavic 2013-09-17 11:43:44 UTC
Created attachment 30852 [details]
Don't return success and an empty brigade in core_input_filter's getline (trunk)

Wow, in fact the core_output_filter returns SUCCESS with an empty brigade in GETLINE mode when the socket is closed, and so does the http input filter...

My understanding of apache is/was that it should never happen (although sanity checks protect this case in several places).

Anyway, the fix has to be in the core input filter (the attached patch), or in the http's one.

My guess is that the first option is better (all the filters should benefit from that, if not broken by :p ).

Otherwise, the http filter's patch follows.
Comment 16 Yann Ylavic 2013-09-17 12:02:18 UTC
Created attachment 30853 [details]
Detect incomplete body in HTTP input filter and return APR_ECONNABORTED (trunk)

The patch to apply for the second option (if the core's patch is not applied).

For (non-)deteminism, it depends on whether the backend connection is closed just after the last chunk is received or later.

If you add a little delay in your python script before closing the connection you should reproduce in any case (without the patch of course).

Personnaly I use netcat as server like :
$ nc -l -p 20580 <<EOF
HTTP/1.1 200 OK
Date: Sun, 15 Sep 2013 06:36:16 GMT
Transfer-Encoding: chunked
Content-Type: text/xml
Server: TornadoServer/3.1

a
1234567890
EOF

or, in the middle of a chunk :
$ nc -l -p 20580 <<EOF
HTTP/1.1 200 OK
Date: Sun, 15 Sep 2013 06:36:16 GMT
Transfer-Encoding: chunked
Content-Type: text/xml
Server: TornadoServer/3.1

100
foo
EOF

To close the connection immediatly I hit Ctrl-C, while the nc's -q0 arg does the job to close the connection immediatly.
Comment 17 Yann Ylavic 2013-09-17 12:04:52 UTC
> To close the connection immediatly I hit Ctrl-C, while the nc's -q0 arg does
> the job to close the connection immediatly.

Well, rather Ctrl-C to close the connection later, and -q0 immediatly :)
Comment 18 Yann Ylavic 2013-09-17 12:26:55 UTC
Just to be precise, the tests (truncated chunked/content-length body) should be OK with attachment 30853 [details] alone or plus attachment 30852 [details], the latter alone won't fix the http filter returning APR_EOF instead of APR_ECONNABORTED.

To make both patches work together, I use the AP_CORE_INPUT_FILTER_GETLINE_EOF macro (defaulting to 1), but this just to show the diffs between the 2 options, and won't be needed in a final patch if one or another (or no) option is choosen.

Any though about this apache team ?
I suppose the core input filter is something that hasn't been changed since a while, a surely lot of things depend on its behaviour, is it an option to fix that here or each code using AP_GETLINE_MODE should be aware of EOF returned as SUCCESS with an empty brigade (and do a non-blocking AP_MODE_SPECULATIVE to be sure) ?
Comment 19 Yann Ylavic 2013-09-18 08:06:10 UTC
Created attachment 30855 [details]
Detect incomplete body in HTTP input filter and return APR_ECONNABORTED (trunk)

A simpler patch for trunk: when the brigade is empty in GETLINE/BLOCKing mode, "come around again".
Comment 20 Yann Ylavic 2013-09-18 08:27:15 UTC
Created attachment 30856 [details]
Don't return success and an empty brigade in core_input_filter's getline/blocking mode (trunk)

The core input filter in GETLINE/NON_BCLOCKing mode should still return SUCCESS with an empty brigade to preserve the existing logic.
Comment 21 Yann Ylavic 2013-09-18 08:32:17 UTC
Created attachment 30857 [details]
Don't return success and an empty brigade in core_input_filter's getline (trunk)

This one is more safe from stray photons.
Comment 22 Yann Ylavic 2013-09-18 11:26:51 UTC
Created attachment 30858 [details]
Detect incomplete body in HTTP input filter and return APR_ECONNABORTED (trunk)

Don't raise an error for EOF in chunks trailers.
Comment 23 Ruediger Pluem 2013-09-18 15:18:57 UTC
Thanks for your work, but I lost a little bit of track :-). Which of the above patches do you currently propose? The last two ones for trunk (from Sept 18th)?
Comment 24 Yann Ylavic 2013-09-18 20:27:55 UTC
Yes, sorry, all this is a bit confusing, let me summarize.

The attachment 30858 [details] alone is enough to solve the current issue, since it will :
1. return APR_ECONNABORTED (instead of APR_EOF) when the body is not complete so that mod_proxy_http handles an error (APR_EOF can't be handled as an error in mod_proxy_http because it can be a legitimate end of a closed/streamed response),
2. retry the ap_get_brigade(AP_MODE_GETLINE, APR_BLOCK_READ) when reading the chunks even if it gets an empty brigade,
3. catch GATEWAY_TIMED_OUT like BAD_GATEWAY in ap_http_outerror_filter and ap_http_chunk_filter to handle the error bucket from mod_proxy_http.

For 1. I first used APR_INCOMPLETE instead of APR_ECONNABORTED, but because of the weird associated error string ("Partial results are valid but processing is incomplete"), I changed to the latter. I'm not sure it is a good idea since APR_ECONNABORTED can be returned from the socket, and now there will be no way to distinguish...

The 2. is needed because ap_core_input_filter returns SUCCESS with an empty brigade in getline+blocking modes when the connection is closed (immediatly). As this seems odd to me, I proposed attachment 30857 [details] to fix the problem at the base so other filters than ap_http_filter don't have to handle this case.

Finally 3. is an issue which must be resolved to address this bug but surely others (and the debate in dev@ is not really closed...), hence I proposed the separate attachment 30849 [details], and finally put it in 3. (forgot about it), let me know if you want me to pull it out of attachment 30858 [details], depending on how you'd likely commit all that (as a whole or separately).
Comment 25 Hannes Schmidt 2013-09-28 00:56:40 UTC
Created attachment 30893 [details]
Detect incomplete body in HTTP input filter and return APR_ECONNABORTED (2.2.15+RHEL6 patches)

Just for the record, I'm attaching a version of Yann's 2.2.x patch, backported to a 2.2.15 source tree with RHEL's long list of patches applied to it.
Comment 26 Hannes Schmidt 2013-10-16 00:06:14 UTC
I noticed that there was some discussion on the mailing list. But that didn't seem go anywhere. What are the next steps to get this committed? Anything needed from me?
Comment 27 Jim Jagielski 2013-11-04 20:56:31 UTC
Yann, Thx for the patches... Ideally, the issue around attachment #2 [details] should be a sep item and discussed on dev@

But the main patch looks good.
Comment 28 Yann Ylavic 2014-04-03 23:22:18 UTC
Fixed in 2.4.8.
Comment 29 Hannes Schmidt 2014-04-05 02:14:12 UTC
This great. Thanks so much, Yann.

Will this also be applied to the 2.2.x branch?
Comment 30 Yann Ylavic 2014-07-15 14:11:40 UTC
(In reply to Hannes Schmidt from comment #29)
> Will this also be applied to the 2.2.x branch?

Backport to 2.2.x proposed.
Will update here if/when approved.
Comment 31 Yann Ylavic 2014-08-22 12:07:18 UTC
Backported in upcoming 2.2.28.