Bug 56729 - Can't use HTTP 1.1 Keep-Alive with mod_reqtimeout if POST request with POST-DATA respond during more than ->RequestReadTimeout body=[timeout]
Summary: Can't use HTTP 1.1 Keep-Alive with mod_reqtimeout if POST request with POST-D...
Status: RESOLVED FIXED
Alias: None
Product: Apache httpd-2
Classification: Unclassified
Component: mod_reqtimeout (show other bugs)
Version: 2.4.10
Hardware: PC Linux
: P2 regression (vote)
Target Milestone: ---
Assignee: Apache HTTPD Bugs Mailing List
URL:
Keywords: FixedInTrunk
Depends on:
Blocks:
 
Reported: 2014-07-16 12:14 UTC by Philippe Doussot
Modified: 2015-07-16 07:34 UTC (History)
3 users (show)



Attachments
testcase_jmeter_cgi.tgz (904.69 KB, application/octet-stream)
2014-07-16 12:23 UTC, Philippe Doussot
Details
testcase_curl_jsp.tgz (694 bytes, application/octet-stream)
2014-07-16 12:33 UTC, Philippe Doussot
Details
Proposed bugfix: do not reset the "in_keep_alive" flag for speculative reads (1.17 KB, patch)
2015-06-30 11:28 UTC, Michael Kaufmann
Details | Diff
Proposed bugfix: do not reset the "in_keep_alive" flag for speculative reads - v2 (589 bytes, patch)
2015-06-30 21:09 UTC, Michael Kaufmann
Details | Diff
Proposed bugfix: do not reset the "in_keep_alive" flag for speculative reads - v3 (548 bytes, patch)
2015-06-30 21:32 UTC, Michael Kaufmann
Details | Diff
Backtraces (reqtimeout_after_body) (3.34 KB, text/plain)
2015-07-01 11:58 UTC, Michael Kaufmann
Details
Full backtraces (reqtimeout_after_body) (8.35 KB, text/plain)
2015-07-01 12:21 UTC, Michael Kaufmann
Details
Set the in_keep_alive state in pre_read_request hook (1.87 KB, patch)
2015-07-01 18:07 UTC, Yann Ylavic
Details | Diff
Full backtrace (reqtimeout_filter) (2.87 KB, text/plain)
2015-07-02 07:30 UTC, Michael Kaufmann
Details
Example module that triggers the bug (mod_bugtest) (899 bytes, application/pgp-signature)
2015-07-02 07:31 UTC, Michael Kaufmann
Details
Apache configuration for the example module (500 bytes, text/plain)
2015-07-02 07:31 UTC, Michael Kaufmann
Details
Use reqtimeout_before_headers instead of reqtimeout_after_body for the HTTP header state (4.39 KB, patch)
2015-07-02 15:41 UTC, Yann Ylavic
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Philippe Doussot 2014-07-16 12:14:31 UTC
Keep-Alive broken by mod_reqtimeout since 2.2.20 :
HTTP 1.1 GET  : works fine
HTTP 1.1 POST without POST-DATA : works fine
HTTP 1.1 POST with POST-DATA : is broken

As the variable is named, **Request**ReadTimeout ... they must not timeout slow **Response** body.

exemple :
Slow generated csv text file data during more than 15 sec with default(ubuntu/debian) RequestReadTimeout body=15
or slow download during more than body=[duration]
or big file download with HTTP POST with post-data during more than timeout


The doc says well that the timeout apply to : receiving **request** headers and body **from client**
http://httpd.apache.org/docs/2.2/mod/mod_reqtimeout.html

During download of the response (header and body(post-data) from the request is already received by apache) , the "timer" of mod reqtimeout continue to increase and at the end of the response the tcp connection is closed ( TCP FIN ).

While keep-alive is in use it close the connexion while another http request is already in transit.

The network connection is expected to survive for later reuse during 30 more seconds ( or 100 hit ) because of Client and Server allow Keep-Alive
and apache respond : Connexion: Keep-Alive


Client Header sent :
    Connection: keep-alive

Server response header from the first request  :
    Keep-Alive: timeout=30, max=100
    Connection: Keep-Alive

But connexion is closee by tcp FIN at then end of the first request.

Slow "download" (from the client point of view) is not slow **Request** sending header or slow sending body like Slowloris ..
Big file download at high speed during more than 15 seconds should not break keep-alive mecanism.

Test case whith mod_proxy and slow jsp page in backend : One POST without reconnect worke fine ( keep-alive used ( max=X decrease  ))

doussot@PC-TEC-12:~$ export URL="http://hostname/page.jsp" ; curl -i -X POST -H "Connection: keep-alive" -w @curl-format_post.txt "$URL" -o /tmp/output1 "$URL" -o /tmp/output2 ; head -n 10 /tmp/output1 ; head -n 10 /tmp/output2
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  195k    0  195k    0     0   9957      0 --:--:--  0:00:20 --:--:--  9693
0,123: time_namelookup
0,164: time_connect
0,164: time_pretransfer
0,218: time_starttransfer
----------
20,087: time_total

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  195k    0  195k    0     0  10045      0 --:--:--  0:00:19 --:--:--  9693
0,000: time_namelookup
0,000: time_connect
0,000: time_pretransfer
0,056: time_starttransfer
----------
19,911: time_total

HTTP/1.1 200 OK
Date: Fri, 11 Jul 2014 12:45:57 GMT
Server: Sun GlassFish Enterprise Server v2.1
X-Powered-By: JSP/2.1
Content-Type: text/html;charset=ISO-8859-1
Set-Cookie: JSESSIONID=............................................; Path=/
Keep-Alive: timeout=30, max=100
Connection: Keep-Alive
Transfer-Encoding: chunked

HTTP/1.1 200 OK
Date: Fri, 11 Jul 2014 12:46:17 GMT
Server: Sun GlassFish Enterprise Server v2.1
X-Powered-By: JSP/2.1
Content-Type: text/html;charset=ISO-8859-1
Set-Cookie: JSESSIONID=............................................; Path=/
Keep-Alive: timeout=30, max=99
Connection: Keep-Alive
Transfer-Encoding: chunked



The second request dont made connect :  time_connect 0,000 .. fine


If we add POST data : curl --data "test=1"


doussot@PC-TEC-12:~$ export URL="http://hostname/page.jsp" ; curl -i --data "test=1" -X POST -H "Connection: keep-alive" -w @curl-format_post.txt "$URL" -o /tmp/output1 "$URL" -o /tmp/output2 ; head -n 10 /tmp/output1 ; head -n 10 /tmp/output2
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  195k    0  195k    0     6   9949      0 --:--:--  0:00:20 --:--:--  9688
0,130: time_namelookup
0,170: time_connect
0,170: time_pretransfer
0,226: time_starttransfer
----------
20,102: time_total

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  195k    0  195k    0     6  10009      0 --:--:--  0:00:19 --:--:--  9682
0,000: time_namelookup
0,044: time_connect
0,044: time_pretransfer
0,100: time_starttransfer
----------
19,982: time_total

HTTP/1.1 200 OK
Date: Fri, 11 Jul 2014 12:47:04 GMT
Server: Sun GlassFish Enterprise Server v2.1
X-Powered-By: JSP/2.1
Content-Type: text/html;charset=ISO-8859-1
Set-Cookie: JSESSIONID=............................................ Path=/
Keep-Alive: timeout=30, max=100
Connection: Keep-Alive
Transfer-Encoding: chunked

HTTP/1.1 200 OK
Date: Fri, 11 Jul 2014 12:47:24 GMT
Server: Sun GlassFish Enterprise Server v2.1
X-Powered-By: JSP/2.1
Content-Type: text/html;charset=ISO-8859-1
Set-Cookie: JSESSIONID=............................................; Path=/
Keep-Alive: timeout=30, max=100
Connection: Keep-Alive
Transfer-Encoding: chunked


2x time_connect != 0
A new connection is made beaucause of the drop of the first by apache2.2
Keep-Alive max reset to 100 for the second request




Tested on :

Debian Squeeze
    apache2.2-bin -> 2.2.16-6+squeeze12    : OK

Ubuntu precise 12.04LTS
    apache2.2-bin -> 2.2.22-1ubuntu1.5     : Broken

Manual build : 2.2.19 : OK
Manual build : 2.2.20 : Broken
Manual build : latest stable : 2.4.9  : Broken

With LogLevel  debug: we can see that the network connexion is reset :( 2.4.9 )
[Wed Jul 16 12:24:39.433978 2014] [core:error] [pid 23190:tid 140066114492160] (104)Connection reset by peer: [client 127.0.0.1:60110] AH00574: ap_content_length_filter: apr_bucket_read() failed


Related to :
https://issues.apache.org/bugzilla/show_bug.cgi?id=51103

http://svn.apache.org/viewvc?view=revision&revision=1162862
http://svn.apache.org/viewvc?view=revision&revision=1103213


Reproducible with différent client http: curl 7.22/jmeter 2.11
Reproducible différent content to download : generated by /bin/bash cgi / jsp / big static file

test case available for jsp cgi jmeter curl to follow
Comment 1 Philippe Doussot 2014-07-16 12:23:06 UTC
Created attachment 31817 [details]
testcase_jmeter_cgi.tgz

testcase jmeter + cgi ( bin/bash ) + result jmeter + network capture wireshark
Comment 2 Philippe Doussot 2014-07-16 12:33:59 UTC
Created attachment 31818 [details]
testcase_curl_jsp.tgz

Test case curl + jsp to generate slow content
Comment 3 Philippe Doussot 2014-08-22 07:59:13 UTC
Bug still present in 2.4.10
Comment 4 Philippe Doussot 2014-08-22 08:12:11 UTC
(In reply to Philippe Doussot from comment #3)
> Bug still present in 2.4.10

Default MPM :
Server Version: Apache/2.4.10 (Unix)
Server MPM: event
Comment 5 Philippe Doussot 2014-08-25 07:12:05 UTC
Maybe related to :
https://issues.apache.org/bugzilla/show_bug.cgi?id=44782

I will check more today and post feedback.
Comment 6 Eric Covener 2014-08-25 16:38:55 UTC
Was able to reproduce, investigating
Comment 7 Eric Covener 2014-08-25 16:56:51 UTC
The failure is during an AP_MODE_SPECULATIVE read during check_pipeline()
Comment 8 Eric Covener 2014-08-30 13:01:07 UTC
Thanks again for the report, fixed in trunk http://svn.apache.org/r1621453 and will propose for backport.
Comment 9 Philippe Doussot 2014-11-03 10:57:31 UTC
Thanks Eric for your quick patch.

Since your patch in trunk on 2014-08-30
the next build in branch 2.2 was out  : http://www.apache.org/dist/httpd/CHANGES_2.2.29
without this bug corrected.

Is the backport of this bug delayed or refused in 2.2 branch ?

Must-I absolutely upgrade to 2.4 to get this work well or just wait next 2.2 release ?
Comment 10 Eric Covener 2014-11-03 14:04:50 UTC
proposed for 22x, patch applies with minor conflict

http://people.apache.org/~covener/patches/httpd-2.2.2-reqtimeout-slow-post.diff
Comment 11 Michael Kaufmann 2015-06-30 11:28:22 UTC
Created attachment 32863 [details]
Proposed bugfix: do not reset the "in_keep_alive" flag for speculative reads

I think that the bugfix ( http://svn.apache.org/r1621453 ) is flawed. I use a module that also issues speculative reads. mod_reqtimeout detects the end of the keep-alive phase too early.

Please have a look at the proposed patch (mod_reqtimeout.c.diff). It resets the "in_keep_alive" flag only if the read is not speculative.
Comment 12 Yann Ylavic 2015-06-30 12:12:32 UTC
The patch looks correct, but I'm not not sure you can safely call the filter stack when the connection is in keepalive state.

With the event MPM the connection may be processed at any time, and with other MPMs there is probably one thread already blocked in the ap_get_brigade() call.

I don't know where your module is hooked though.
Comment 13 Michael Kaufmann 2015-06-30 15:53:27 UTC
My module is a content generator (uses ap_hook_handler). But I have discovered that the speculative reads are triggered by check_pipeline(), not by my module - sorry for the confusion...

I have debugged the problem. Sometimes check_pipeline() is called *before* the connection enters the keep-alive state (and also before the log_transaction handler that mod_reqtimeout uses to detect the end of the request). In this case, it works. The branch "if (block == APR_NONBLOCK_READ && mode == AP_MODE_SPECULATIVE) ..." is taken.

But sometimes check_pipeline() is called *after* the connection has already entered the keep-alive state. Then the branch "if (ccfg->in_keep_alive) ..." is taken, and this branch does not check for speculative reads - that's the bug.

I don't know exactly how these two cases are triggered.

It would be great if this bugfix and the previous bugfixes r1621453 / r1641376 would be included in a future 2.4.x release.
Comment 14 Michael Kaufmann 2015-06-30 21:09:28 UTC
Created attachment 32866 [details]
Proposed bugfix: do not reset the "in_keep_alive" flag for speculative reads - v2

I have attached a better (less intrusive) patch: mod_reqtimeout.c-v2.diff
Comment 15 Michael Kaufmann 2015-06-30 21:32:59 UTC
Created attachment 32867 [details]
Proposed bugfix: do not reset the "in_keep_alive" flag for speculative reads - v3
Comment 16 Yann Ylavic 2015-06-30 22:22:34 UTC
Thanks Michael for the analyse and patches!

(In reply to Michael Kaufmann from comment #13)
> 
> But sometimes check_pipeline() is called *after* the connection has already
> entered the keep-alive state. Then the branch "if (ccfg->in_keep_alive) ..."
> is taken, and this branch does not check for speculative reads - that's the
> bug.
> 
> I don't know exactly how these two cases are triggered.

Could you please provide a backtrace (gdb's "bt full"?) of this situation?

> 
> It would be great if this bugfix and the previous bugfixes r1621453 /
> r1641376 would be included in a future 2.4.x release.

I'll propose a backport for the whole, but first the above trace would help determine the root cause (I'm not sure whether this should happen).
Comment 17 Michael Kaufmann 2015-07-01 11:58:15 UTC
Created attachment 32872 [details]
Backtraces (reqtimeout_after_body)

The backtraces for reqtimeout_filter look the same, so I have created backtraces for reqtimeout_after_body (attached).

I have used Apache 2.4.10, a patched mod_reqtimeout (including r1621453 + r1641376 from trunk) and the worker MPM.
Comment 18 Michael Kaufmann 2015-07-01 12:21:45 UTC
Created attachment 32873 [details]
Full backtraces (reqtimeout_after_body)
Comment 19 Yann Ylavic 2015-07-01 18:07:37 UTC
Created attachment 32874 [details]
Set the in_keep_alive state in pre_read_request hook

Does this patch solve the issue?

Otherwise, could you please, provide the backtrace from reqtimeout_filter() for the two cases (so to determine where the "faulty" call to the filter comes from)?

Thanks!
Comment 20 Michael Kaufmann 2015-07-02 07:29:36 UTC
Unfortunately your patch does not work with the event MPM (but yes, it fixes the bug for the worker MPM).

I will attach the backtrace. I have also created an example module that you can use to reproduce the bug.
Comment 21 Michael Kaufmann 2015-07-02 07:30:09 UTC
Created attachment 32875 [details]
Full backtrace (reqtimeout_filter)
Comment 22 Michael Kaufmann 2015-07-02 07:31:01 UTC
Created attachment 32876 [details]
Example module that triggers the bug (mod_bugtest)
Comment 23 Michael Kaufmann 2015-07-02 07:31:31 UTC
Created attachment 32877 [details]
Apache configuration for the example module
Comment 24 Michael Kaufmann 2015-07-02 07:33:18 UTC
With the example module & the worker MPM, it's possible to trigger the bug with this request:

$ time telnet localhost 8080
Trying ::1...
Connected to localhost.
Escape character is '^]'.
GET / HTTP/1.1
Host: localhost
Connection: Keep-Alive

HTTP/1.1 200 OK
Date: Thu, 02 Jul 2015 07:25:33 GMT
Server: Apache/2.4.12 (Unix)
Content-Length: 4
Keep-Alive: timeout=20, max=100
Connection: Keep-Alive

TestConnection closed by foreign host.

real	0m10.661s
user	0m0.000s
sys	0m0.002s
Comment 25 Yann Ylavic 2015-07-02 15:41:22 UTC
Created attachment 32879 [details]
Use reqtimeout_before_headers instead of reqtimeout_after_body for the HTTP header state

How about this version?

Otherwise, could you define "does not work with the event MPM"?
Comment 26 Michael Kaufmann 2015-07-02 19:10:46 UTC
> How about this version?

Probably does not work too (not tested) because it has the same problem as the previous patch (see below)


> Otherwise, could you define "does not work with the event MPM"?

The worker MPM calls the hook "pre_read_request" when a request has been processed, *before* the keep-alive state. The event MPM calls this hook when the next request has been detected, *after* the keep-alive state.

Example:

Detect incoming data
Process request 1
Finish request 1
--> worker calls the pre_read_request hooks here
... wait for next request (keep-alive state) ...
Detect incoming data
--> event calls the pre_read_request hooks here
Process request 2
Finish request 2

So with the event MPM, the pre_read_request hook cannot be used to set ccfg->in_keep_alive because it's too late.

May I remind you that I have proposed a working patch? ;-)
Comment 27 Yann Ylavic 2015-07-02 23:59:20 UTC
(In reply to Michael Kaufmann from comment #26)
> 
> Probably does not work too (not tested) because it has the same problem as
> the previous patch (see below)

You possibly should test it...

> 
> The worker MPM calls the hook "pre_read_request" when a request has been
> processed, *before* the keep-alive state. The event MPM calls this hook when
> the next request has been detected, *after* the keep-alive state.

I think we need to agree on what the keepalive state is.
From mod_reqtimeout POV, this is when the time taken waiting for the next request should not be taken into account, which happens to be when called from read_request_line().

This report is about subsequent requests possibly timedout by the module because the keep-alive time becomes accounted after some speculative read (from check_pipeline) resets the state.

After r1621453 and r1641376 (which BTW I assume you applied already), the new issue is that the log_transaction hook may happen either before or after the subsequent request is processed, depending on whether the previous request has already been destroyed or not (e.g. pipelined responses).

In the latter case, the keepalive state may not be set (or too late, or twice, ..., that's a race).

With the latest patch, read_request_line() is called just after the pre_read_request hook (which indeed happens after the previous request is processed, but mainly *before* the one mod_reqtimeout's state is for).

The worker MPM calls read_request_line() directly to read the next request, so it will pass the keep-alive timeout time in reqtimeout_filter(), we must not account it, nor depend on the previous request to be destroyed.

The event MPM handles the kept-alive connections by itself (polling the socket directly, without using the input filter stack), consequently when read_request_line() is then called, the current (as per the latest patch) request is indeed already available, yet we can't depend on the previous request to update the state.

That's what the latest patch addresses, the state is initialized in the pre_read_request hook ([be]for[e] the new request) by reqtimeout_before_headers(), thus the racy log_transaction hook reqtimeout_after_body() is not needed anymore.

The patch now also account for the request line read itself (after the wait), whereas the original code simply forwarded the result with no timeout check (subsequent requests only).

Hopefully it would solve your issue since now I can't see how the state could be wrong (from moq_reqt POV...).

> 
> Example:
> 
> Detect incoming data
> Process request 1
> Finish request 1
worker and event could have called ap_run_log_transaction() here, or not...
> --> worker calls the pre_read_request hooks here
> ... wait for next request (keep-alive state) ...
worker::ap_read_request()::ap_run_pre_read_request()::reqtimeout_before_headers()
worker::process_socket()::ap_process_connection():..:ap_read_request()::read_request_line()::reqtimeout_filter()
Whereas:
event::[listen]::poll()
> Detect incoming data
> --> event calls the pre_read_request hooks here
Yes, with:
event::[work]::ap_run_process_connection():..:ap_read_request()::ap_run_pre_read_request()::reqtimeout_before_headers()
And then:
event::[work]::ap_run_process_connection():..:ap_read_request()::read_request_line()::reqtimeout_filter()
event::[work]::ap_run_process_connection():..:ap_read_request()::ap_read_mime_headers()::reqtimeout_filter()*
Whereas worker is already here:
worker::process_socket()::ap_process_connection():..:ap_read_request()::ap_read_mime_headers()::reqtimeout_filter()*
> Process request 2
> Finish request 2
> 
> So with the event MPM, the pre_read_request hook cannot be used to set
> ccfg->in_keep_alive because it's too late.

That's just before the very first read about the upcoming request on this kept-alive connection, we must reset the state here, and set in_keep_alive (which really is the state of the connection until we read something by the way, and waiting for this read shall not be accounted).

> 
> May I remind you that I have proposed a working patch? ;-)

It can't be correct, as any concurrent read (be it speculative) during the keep-alive state.
Comment 28 Michael Kaufmann 2015-07-03 18:05:38 UTC
> You possibly should test it...
Yes, you are right. I have tested the latest patch and it indeed fixes this bug! Sorry that I have made wrong assumptions.

> After r1621453 and r1641376 (which BTW I assume you applied already)
That's correct, I have tested including these two changes.

> thus the racy log_transaction hook reqtimeout_after_body() is not needed anymore.
Good refactoring! (minor issue: the old hook name "log_transaction" is still mentioned in a comment)

> I can't see how the state could be wrong (from moq_reqt POV...).
Maybe some future or custom MPM does nonblocking+speculative reads in the keep-alive state. So perhaps the corresponding "if" block that handles this case should be moved to the top of the function. But this is probably more a theoretical issue.

Yann, I really appreciate the time and effort that you have put into fixing this bug. Keep up the good work!
Comment 29 Yann Ylavic 2015-07-06 08:25:00 UTC
Thanks Michael for the feedbacks and testing on this issue.

Committed in r1689325 and backport proposed to 2.4.x (including r1621453 and r1641376).
Comment 30 Yann Ylavic 2015-07-16 07:34:32 UTC
Fixed in 2.4.16 (r1689922) and 2.2.30 (r1678698).