Bug 63855 - Regression in handling 100-continue after PR 60330
Summary: Regression in handling 100-continue after PR 60330
Status: NEW
Alias: None
Product: Apache httpd-2
Classification: Unclassified
Component: mod_proxy_http
Version: 2.4.41
Hardware: Sun Solaris
Importance: P2 regression
Target Milestone: ---
Assignee: Apache HTTPD Bugs Mailing List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-10-16 01:36 UTC by Konstantin J. Chernov
Modified: 2019-10-21 02:41 UTC



Attachments
Original attempt, where the issue is shown (39.15 KB, text/plain)
2019-10-16 11:23 UTC, Konstantin J. Chernov
Second attempt (full trace7), issue is not reproducing (38.45 KB, text/plain)
2019-10-16 11:24 UTC, Konstantin J. Chernov
Third attempt (same log settings as in ho1.txt) - issue is not reproducing (59.11 KB, text/plain)
2019-10-16 11:24 UTC, Konstantin J. Chernov
Fourth attempt (same settings as in ho1.txt, with ping parameter on BM) -- issue is not reproducing (61.78 KB, text/plain)
2019-10-16 11:40 UTC, Konstantin J. Chernov
Send 100-continue (if needed) when spooling the request body (739 bytes, patch)
2019-10-16 16:47 UTC, Yann Ylavic
Send 100-continue (if needed) when spooling the request body (v2) (1.01 KB, patch)
2019-10-19 14:21 UTC, Yann Ylavic
Send 100-continue (if needed) when spooling the request body (v3) (6.40 KB, patch)
2019-10-20 14:41 UTC, Yann Ylavic

Description Konstantin J. Chernov 2019-10-16 01:36:59 UTC
Hello.

We are using 64-bit Apache 2.4.41 on Solaris SPARC.
After an attempted migration from 2.4.33, we were forced to roll back because of the way 100-continue is handled after PR 60330.

We have Apache running as a reverse proxy, with Apache Tomcat as the backend.
After receiving a request with "Expect: 100-Continue", while the client is waiting for "100 Continue" before sending any request body, Apache sends no request to Tomcat and does not send a "100 Continue" back to the client either.

It's very easy to reproduce:
{code}
openssl s_client -connect host:port -cert prv_key.pem -key prv_key.key
POST /service HTTP/1.1
Host: host:port
Content-Type: text/plain
Content-Length: 1
Expect: 100-Continue

{code}

After the last \r\n the client expects to see "HTTP/1.1 100 Continue", but Apache never sends it.
However, if I enter any character after the last \r\n, the reply is correct (the 415 is fine in this case):
{code}
...
Content-Type: text/plain
Expect: 100-Continue
Content-Length: 1

1
HTTP/1.1 100 Continue

HTTP/1.1 415 Unsupported Media Type
Date: Wed, 16 Oct 2019 01:27:37 GMT
Server: Apache-Coyote/1.1
Content-Length: 0
{code}

When I run a snoop (tcpdump), I never see any packets going to the backend until I enter a request body. 

But the problem disappears when I add:
{code}
SetEnv proxy-sendchunks 1
{code}
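The same hang can be exercised with any raw client. A minimal Python sketch (the function names are illustrative; the header check follows RFC 7231, where the "100-continue" expectation token is case-insensitive, so the "100-Continue" spelling typed above is equivalent):

```python
import socket

def expects_100_continue(headers):
    # Per RFC 7231, the Expect value "100-continue" is a
    # case-insensitive token, so "100-Continue" matches too.
    return headers.get("Expect", "").strip().lower() == "100-continue"

def post_with_expect(sock, head, body, timeout=3.0):
    # Send the request head only, then wait up to `timeout` seconds for
    # the interim "HTTP/1.1 100 Continue" before sending the body.
    # In the reported bug the wait never ends: Apache sends nothing to
    # the client and contacts no backend until the body arrives.
    sock.sendall(head)
    sock.settimeout(timeout)
    try:
        interim = sock.recv(64, socket.MSG_PEEK)
        got_100 = interim.startswith(b"HTTP/1.1 100")
    except socket.timeout:
        got_100 = False  # the symptom described above
    sock.sendall(body)  # give up waiting and send the body anyway
    return got_100
```

Typing the body by hand in the openssl session plays the role of the timeout fallback here.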

Can you please help me figure out why this is happening, and whether there is any 'correct' way to handle this bug? Perhaps it's possible to add another environment variable to restore the 100-continue behaviour from before PR 60330?

Thank you very much in advance.

Our configuration (relevant to this case):
{code}
HttpProtocolOptions Unsafe LenientMethods Allow0.9

SetEnv proxy-initial-not-pooled 1
SetEnv force-response-1.0 1

SetEnvIf User-Agent ".*MSIE.*" nokeepalive ssl-unclean-shutdown
SetEnvIf Request_Protocol HTTP/1.0$ force-proxy-request-1.0=1

<Proxy balancer://hostPORT>
  BalancerMember http://host1:PORT connectiontimeout=5 timeout=30 retry=30 ping=3 route=1 loadfactor=1
  BalancerMember http://host2:PORT connectiontimeout=5 timeout=30 retry=30 ping=3 route=2 loadfactor=1
</Proxy>

<Location /service>
  ProxyPass balancer://hostPORT/service
  ProxyPassReverse balancer://hostPORT/service
</Location>
{code}
Comment 1 Yann Ylavic 2019-10-16 09:06:48 UTC
Could you please provide the error_log of the transaction with LogLevel trace7?
Comment 2 Konstantin J. Chernov 2019-10-16 11:23:46 UTC
Created attachment 36834 [details]
Original attempt, where the issue is shown
Comment 3 Konstantin J. Chernov 2019-10-16 11:24:07 UTC
Created attachment 36835 [details]
Second attempt (full trace7), issue is not reproducing
Comment 4 Konstantin J. Chernov 2019-10-16 11:24:26 UTC
Created attachment 36836 [details]
Third attempt (same log settings as in ho1.txt) - issue is not reproducing
Comment 5 Konstantin J. Chernov 2019-10-16 11:24:47 UTC
Some magic is definitely happening here.

I was unable to reproduce this issue 12 hours after the report. No changes were made to Tomcat or Apache during this time.

I've added two attachments: ho1.txt was made during the original testing, when the issue happened.
It was made with
{code}
  LogLevel ssl:debug dumpio:trace7 proxy:trace7 proxy_balancer:trace7 proxy_http:trace7
  DumpIOInput on
  DumpIOOutput on
{code}

In my attempts to reproduce this issue, I had to set up another Apache instance with the same configuration; ho2.txt is my later attempt to reproduce the issue with trace7 enabled.

Also, ho3.txt is an attempt to reproduce the issue with the same settings as mentioned above (with dumpio turned on). You can clearly see the difference between ho1.txt and ho3.txt.

1. In the first attempt:
{code}
[2019-10-16 07:46:01.947019] [proxy:debug] [pid 16948:tid 50] [C:AHHOPGzEn2k] [XaZoSU2Rh-cAhSIGXhDvRgAAAG8] [CLIENT_IP:64606] AH00947: connected /service/endpoint to BACKEND_HOST_2:4005
[2019-10-16 07:46:01.947259] [dumpio:trace7] [pid 16948:tid 50] [C:AHHOPGzEn2k] [-] [CLIENT_IP:64606] mod_dumpio: dumpio_in [readbytes-nonblocking] 466 readbytes
[2019-10-16 07:46:01.947416] [dumpio:trace7] [pid 16948:tid 50] [C:AHHOPGzEn2k] [-] [CLIENT_IP:64606] mod_dumpio: dumpio_in - 11
[2019-10-16 07:46:01.947493] [dumpio:trace7] [pid 16948:tid 50] [C:AHHOPGzEn2k] [-] [CLIENT_IP:64606] mod_dumpio: dumpio_in [readbytes-blocking] 466 readbytes
[2019-10-16 07:47:01.952546] [dumpio:trace7] [pid 16948:tid 50] [C:AHHOPGzEn2k] [-] [CLIENT_IP:64606] mod_dumpio: dumpio_in - 70007
{code}
2. In the last attempt:
{code}
[2019-10-16 18:15:58.654483] [proxy:debug] [pid 26661:tid 26] [C:AEBSpTmgr34] [Xab77jMz3Hj958KCYob0FgAAABc] [CLIENT_IP:34060] AH00947: connected /service/endpoint to BACKEND_HOST_2:4005
[2019-10-16 18:15:58.654713] [dumpio:trace7] [pid 26661:tid 26] [C:AEBSpTmgr34] [-] [CLIENT_IP:34060] mod_dumpio: dumpio_in [readbytes-nonblocking] 466 readbytes
[2019-10-16 18:15:58.654864] [dumpio:trace7] [pid 26661:tid 26] [C:AEBSpTmgr34] [-] [CLIENT_IP:34060] mod_dumpio: dumpio_in - 11
[2019-10-16 18:15:58.655007] [proxy:trace2] [pid 26661:tid 26] [C:-] [-] [-] HTTP: fam 2 socket created to connect to BACKEND_HOST_2
...
{code}

I don't understand what made Apache attempt to read in blocking mode earlier.

Note that sensitive data has been masked in the logs. I hope that won't make debugging harder.
Comment 6 Konstantin J. Chernov 2019-10-16 11:40:10 UTC
Created attachment 36837 [details]
Fourth attempt (same settings as in ho1.txt, with ping parameter on BM) -- issue is not reproducing

Also adding ho4.txt (the ping parameter is set on BalancerMember).
The behaviour is the same as in ho3.txt, but it shows that the absence of "Expect: 100-Continue" in the first request is not the cause.
Comment 7 Yann Ylavic 2019-10-16 15:17:02 UTC
Possibly you had some "SetEnv proxy-interim-response" policy configured on the original attempt but not on the following ones?
Comment 8 Konstantin J. Chernov 2019-10-16 15:46:51 UTC
No, that env var was not set (all the SetEnv directives used are in the first comment).
Isn't that variable supposed to suppress the 100-continue response to the client? In this case, Apache is not making its call to the backend at all while attempting to read the request body.
Comment 9 Yann Ylavic 2019-10-16 16:47:29 UTC
Created attachment 36838 [details]
Send 100-continue (if needed) when spooling the request body

OK, mod_proxy blocks on spooling the request body because of some filter in the input stack (which makes Content-Length forwarding impossible). That's why proxy-sendchunks resolved the issue too: it avoids spooling.
For the first request the body is prefetched, so there is no need to spool.

Could you try this patch? It ensures that a "100 Continue" interim response is always sent before spooling, if needed.
Comment 10 Konstantin J. Chernov 2019-10-16 17:08:55 UTC
Thanks, I'll try to recompile/install it on a test environment tomorrow (and hopefully get it into prod a week later) to see if anything changes.

I'll also try to reproduce this issue several times during the day on both patched and unpatched versions, hopefully I'll catch it.

But there's one thing I didn't get...
If there's no spooling on the first request, then the case I observed (it was copied from the console) couldn't happen at all.
However, I got the following output:
{code}
openssl s_client -connect host:port -cert prv_key.pem -key prv_key.key
POST /service HTTP/1.1
Host: host:port
Content-Type: text/plain
Content-Length: 1
Expect: 100-Continue

<nothing_happens_until_body_entered>1
HTTP/1.1 100 Continue

HTTP/1.1 415 Unsupported Media Type
Date: Wed, 16 Oct 2019 01:27:37 GMT
Server: Apache-Coyote/1.1
Content-Length: 0
{code}
Comment 11 Yann Ylavic 2019-10-16 19:53:28 UTC
I meant the first request in "ho1.txt", not the one from your "openssl s_client".

This first request in "ho1.txt" is sent with both the header and body at the same time (and by the way has no "Expect: 100-continue"). The full body is available to mod_proxy during prefetch, all nonblocking, which means that it can determine/adjust the Content-Length by itself if needed, and forward the request to the backend immediately.

On the other hand, your "openssl s_client" request sends the HTTP header first and waits before sending the body. In this case, before forwarding the request, mod_proxy will check whether the body could be modified internally, which would change the Content-Length; if so (i.e. any content filter is present in the filter chain), it will spool the body to disk so that it can forward the potentially new/correct Content-Length along with the new body. This can be tuned with e.g. proxy-sendchunks, which frees mod_proxy from the Content-Length constraint (but requires a chunk-compatible backend) and thus allows the header to be forwarded immediately.

Now comes "Expect: 100-continue". For this request you'd expect mod_proxy not to wait for the body, since the client wants to wait for a "100 Continue" before sending it. But in the spooling case mentioned above, mod_proxy still needs the full content to determine the C-L, and thus it must read the body by and for itself. What my patch fixes is that in this case mod_proxy now honors the "Expect: 100-continue" and sends the "100 Continue" response before starting to read/spool (otherwise the client and mod_proxy wait for each other). This is a case where 100-continue cannot be forwarded to (initiated by) the backend; the job must be done on the proxy side (for the client side at least; nothing prevents mod_proxy from having its own 100-continue negotiation with the backend).
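The decision described here can be summarized as a small decision table. This is an illustrative sketch only (the function and parameter names are made up for clarity; this is not mod_proxy_http's actual code path):

```python
def forwarding_strategy(body_fully_prefetched, content_filter_present,
                        proxy_sendchunks, expecting_100):
    """Return a pair: (how the request is forwarded, whether the proxy
    itself must send "100 Continue" before reading the body)."""
    if body_fully_prefetched:
        # Whole body available nonblocking at prefetch: the proxy can
        # determine/adjust Content-Length itself and forward at once.
        return "forward-with-content-length", False
    if proxy_sendchunks:
        # Chunked encoding frees the proxy from the Content-Length
        # constraint, so the header is forwarded immediately.
        return "forward-chunked", False
    if content_filter_present:
        # The body may be rewritten, changing Content-Length, so it is
        # spooled to disk first.  The fix: honor the client's 100-continue
        # expectation here, otherwise client and proxy wait on each other.
        return "spool-then-forward", expecting_100
    # No filter in the way: Content-Length can be forwarded as-is.
    return "forward-with-content-length", False
```

The reported hang corresponds to the (False, True, False, True) row: spooling was entered without the interim response ever being sent.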

Hope this helps..
Comment 12 Konstantin J. Chernov 2019-10-17 03:11:35 UTC
Yann, thank you very much for the detailed explanation and for the patch.

First of all, I finally reproduced the issue, and you were right that it was caused by a per-request content filter.
And I deeply apologize for not mentioning that we are using some such modules (I never thought they could cause an issue like this).

I made a small module which is causing this issue when turned on: https://pastebin.com/3cARnJwm
Also, mod_security is causing this issue as well.

So, the correct steps to reproduce are:
1. Turn on the per-request input filter
2. Send the request with openssl as shown above
3. The 100 Continue won't be sent until the request body is sent by the client

When I applied your patch, the 100-continue reply was missing the status line:
{code}
POST /service/endpoint HTTP/1.1
Host: host:port
Content-Type: text/xml
Content-Length: 1
Expect: 100-Continue

HTTP/1.1 
{code}

So I modified it like this (I'm not sure whether it is the correct way to handle this, as I was unable to find where the status_line is set during the correct flow):
{code}
    /* Send "100 Continue" if needed before spooling */
    if (req->expecting_100) {
        int saved_status = r->status;
        const char *saved_status_line = r->status_line;
        r->status = HTTP_CONTINUE;
        r->expecting_100 = 1;
        r->status_line = ap_get_status_line(r->status);

        ap_send_interim_response(r, 1);

        r->status = saved_status;
        r->status_line = saved_status_line;
        req->expecting_100 = 0;
    }
{code}

After I recompiled mod_proxy_http.so with the changes mentioned above, the issue is gone.
Comment 13 Yann Ylavic 2019-10-19 14:21:39 UTC
Created attachment 36848 [details]
Send 100-continue (if needed) when spooling the request body (v2)

After some discussion on the dev@ mailing list, the right patch would be more like this v2.

Konstantin, could you please try v2 instead in your environment? As a bonus, this one should not require your fix on top of it :p
Comment 14 Konstantin J. Chernov 2019-10-19 15:10:34 UTC
Yann, unfortunately, it's not working :(
It's not sending a 100 Continue at all.

{code}
POST /service/endpoint HTTP/1.1
Host: host:port
Content-Type: text/plain
Content-Length: 1
Expect: 100-Continue

<wait_for_100_continue>1
HTTP/1.1 415 Unsupported Media Type
Date: Sat, 19 Oct 2019 15:04:44 GMT
Server: Apache-Coyote/1.1
Strict-Transport-Security: max-age=63072000; includeSubdomains; preload
Content-Length: 0
{code}

With Proxy100Continue off (I've finally read the new docs and found this magic switch!) and with the previous version of the patch:
{code}
POST /service/endpoint HTTP/1.1
Host: host:port
Content-Type: text/plain
Content-Length: 1
Expect: 100-Continue

HTTP/1.1 100 Continue

1
HTTP/1.1 415 Unsupported Media Type
Date: Sat, 19 Oct 2019 15:08:53 GMT
Server: Apache-Coyote/1.1
Strict-Transport-Security: max-age=63072000; includeSubdomains; preload
Content-Length: 0
{code}
Comment 15 Yann Ylavic 2019-10-20 14:41:47 UTC
Created attachment 36849 [details]
Send 100-continue (if needed) when spooling the request body (v3)

Indeed, v2 does not work because it's too late (in mod_proxy_http's spooling function) to rely on the HTTP input filter to send the "100 Continue". Sorry for the useless testing.

This v3 goes back to the same fix as the first patch, plus handling of the "100 Continue" status line directly in ap_send_interim_response() (which adds quite some "noise" because it requires merging ap_get_status_line_ex(), which was trunk-only so far).
Could you please test it so that I can propose it for 2.4.x?

As for "Proxy100Continue off": it forces mod_proxy to handle the 100-continue expectation immediately, and thus fetch/forward the body on its own regardless of the backend (origin) intent. This may be what you want (since mod_security prevents 100-continue forwarding anyway), but it kind of defeats the purpose of 100-continue IMHO.
Another option is "SetEnv proxy-sendchunks", as you noticed, but it requires that the backends support chunked encoding in requests (which shouldn't be an issue these days).
The last option I can think of would be to convince the mod_security team to register their input filter as a protocol filter (as opposed to a content filter), so that mod_proxy can forward the original Content-Length...
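For reference, the chunked transfer coding that proxy-sendchunks switches the request body to is simple enough that most modern backends handle it. A minimal encoder sketch (per RFC 7230's chunked coding; chunk sizes are hexadecimal):

```python
def chunk_encode(body: bytes) -> bytes:
    # One data chunk (hex size, CRLF, data, CRLF) followed by the
    # zero-sized last chunk and an empty trailer section.
    out = b""
    if body:
        out += b"%x\r\n%s\r\n" % (len(body), body)
    out += b"0\r\n\r\n"
    return out
```

The one-byte body from the reproduction above would be sent as `b"1\r\n1\r\n0\r\n\r\n"` instead of a bare `1` announced by `Content-Length: 1`, which is exactly the readability concern raised below.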
Comment 16 Yann Ylavic 2019-10-20 14:43:06 UTC
(In reply to Yann Ylavic from comment #15)
> Last option (I can think of) would be to convince the mod_security team to
> register their input filter as a protocol filter

They can't do that if their filter can modify the body/content, though.
Comment 17 Konstantin J. Chernov 2019-10-21 02:41:53 UTC
> Could you please test it so that I can propose it for 2.4.x?
Looks like everything is working fine with v3. Thanks a lot! :)

> This may be what you want (since mod_security prevents 100-continue forwarding anyway), but it kind of defeats the purpose of 100-continue IMHO.
Well, we are not using mod_proxy_http on its own, but only paired with mod_proxy_balancer, so passing 100-continue through seems a bit weird to me, as Apache has always done its own 100-continue check, choosing another backend in case of failure.
So from my perspective, turning Proxy100Continue off is pretty much OK in our case.

> Another option is to "SetEnv proxy-sendchunks" as you noticed, but it requires that the backends supports chunk encoding in requests (shouldn't be an issue these days).
I'd prefer not to use chunked requests/responses at all. They're harder to read/debug (with Wireshark, etc.), and a few years ago we had issues when a client was sending a plain request over a socket and regexp-parsing the answer, expecting a well-known formatted response with Content-Length set, and the response to the client suddenly went from "Content-Length: xxx" to chunks.
When the clients are browsers, chunking might be better/faster/etc., along with trendy things like HTTP/2, but when handling M2M requests, the only thing it gives us is a headache :)