Bug 60956 - Event MPM listener thread may get blocked by SSL shutdowns
Summary: Event MPM listener thread may get blocked by SSL shutdowns
Status: RESOLVED FIXED
Alias: None
Product: Apache httpd-2
Classification: Unclassified
Component: mpm_event (show other bugs)
Version: 2.4.23
Hardware: PC Linux
: P2 major (vote)
Target Milestone: ---
Assignee: Apache HTTPD Bugs Mailing List
URL:
Keywords: FixedInTrunk
Depends on:
Blocks:
 
Reported: 2017-04-05 13:20 UTC by Frank Meier
Modified: 2017-09-22 10:57 UTC (History)
5 users (show)



Attachments
module generating data without flushing to trigger write completion feature (1.95 KB, text/x-csrc)
2017-05-09 09:25 UTC, Frank Meier
Details
special https client (4.56 KB, text/x-c++src)
2017-05-09 09:27 UTC, Frank Meier
Details
simple httpd config (967 bytes, text/plain)
2017-05-09 09:27 UTC, Frank Meier
Details
Defer nonblocking lingering close to workers (11.98 KB, patch)
2017-07-21 00:46 UTC, Yann Ylavic
Details | Diff
Defer nonblocking lingering close to workers (v3) (14.54 KB, patch)
2017-07-21 16:56 UTC, Yann Ylavic
Details | Diff
Defer nonblocking lingering close to workers (v4) (14.66 KB, patch)
2017-07-22 00:13 UTC, Yann Ylavic
Details | Diff
Defer nonblocking lingering close to workers (v5) (14.51 KB, patch)
2017-07-22 11:30 UTC, Yann Ylavic
Details | Diff
Backport of r1802875 to 2.4.x (16.87 KB, patch)
2017-09-18 09:17 UTC, Yann Ylavic
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Frank Meier 2017-04-05 13:20:56 UTC
I have analyzed an Apache httpd 2.4.23 server that did not handle new connections anymore. I found this stack trace:

#0  0x00007f996d44f283 in poll () from /lib64/libc.so.6
#1  0x00007f996df8764f in apr_poll () from /opt/apache/bin/libapr-1.so.0
#2  0x00007f996eacb485 in ap_core_output_filter ()
#3  0x00007f996cf46488 in bio_filter_out_pass () from /opt/apache/bin/mod_ssl.so
#4  0x00007f996cf483bf in bio_filter_out_ctrl () from /opt/apache/bin/mod_ssl.so
#5  0x00007f996cf5803b in modssl_smart_shutdown () from /opt/apache/bin/mod_ssl.so
#6  0x00007f996cf4856e in ssl_filter_io_shutdown.isra.2 () from /opt/apache/bin/mod_ssl.so
#7  0x00007f996cf49c10 in ssl_io_filter_output () from /opt/apache/bin/mod_ssl.so
#8  0x00007f996cf46b4e in ssl_io_filter_coalesce () from /opt/apache/bin/mod_ssl.so
#9  0x00007f996ead9f93 in ap_shutdown_conn ()
#10 0x00007f996a505702 in start_lingering_close_nonblocking () from /opt/apache/bin/mod_mpm_event.so
#11 0x00007f996a5040ac in process_timeout_queue () from /opt/apache/bin/mod_mpm_event.so
#12 0x00007f996a5063b0 in listener_thread () from /opt/apache/bin/mod_mpm_event.so
#13 0x00007f996d90faa1 in start_thread () from /lib64/libpthread.so.0
#14 0x00007f996d458aad in clone () from /lib64/libc.so.6


The function start_lingering_close_nonblocking() is blocked by a call to poll(), which must not happen. Because the listener thread is blocked, this process does not accept new connections anymore.

The line numbers are missing in the stack, but I think this happens:
- ap_shutdown_conn() creates an "End Of Connection" bucket (EOC)
- mod_ssl detects this in ssl_io_filter_output() and calls modssl_smart_shutdown()
- modssl_smart_shutdown() sends an SSL "close notify" shutdown alert to the peer and then flushes the data - this may block

The clean SSL shutdown has been implemented in Apache httpd 2.4.12 (see bug 54998). Previous versions of Apache httpd 2.4 are not affected.
Comment 1 Frank Meier 2017-05-09 09:18:27 UTC
I finally was able to reproduce the phenomenon. It is occurs if a request handler triggers the asynchronous write completion feature, which gives the listener thread the opportunity to send the final bytes of a response to the client asynchronously, without blocking a worker thread. But if the client refuses to read the data, the connection gets stalled (TCP Window FULL message in wireshark). This does not block the listener thread, since it does the writing asynchronously, and a stalled connection is not a problem. But then, after the timeout ([1] default 60s), the listener thread wants to close the connection and triggers start_lingering_close_nonblocking() and the listener thread gets blocked as described above. After another timeout interval [1] the listener thread recovers from it's misery.

The tricky part to reproduce this, is to get the right amout of data locked in the TCP pipeline (receive buffer of the client, and the send buffer of the server).
1) If the client blocks to early, the pipeline fills up, but if the module has more than 64k of data to send, the asynchronous write completion feature is not triggered. 
2) If the client blocks to late, there is enough "space" in the TCP pipeline to accommodate all the remaining bytes including the SSL shutdown alert, in which case the start_lingering_close_nonblocking() function does not block.

I've written some test code to simplify the reproducibility:
* a httpd module (mod_gendata) which generates a given amount of body data, where the last 60k are not flushed, this should trigger the asynchronous write completion in the listener thread.
* a special HTTPS client, that reads a given amount of data from the server and then stops reading completely.
* a httpd.conf file that only starts one single httpd process with 2 worker threads, that makes it easy to show whats happening if we look at the stack of the process.

On my test system the TCP pipeline was full at around ~800k. So I request 850k of data, read ~1000 bytes so the headers are received and then stop receiving. I see, that the write completion was triggered if all the worker threads are idle (check with gstack). And I see if the TCP pipeline is full, when the TCP connection does *not* enter the FIN_WAIT1 state after the configured KeepAliveTimout [2] (check with netstat). If both conditions are met, the listener thread calls start_lingering_close_nonblocking() after 60s, and blocks. It may take some tries to figure out the right amount of data that has to be requested to get it right.


[1] https://httpd.apache.org/docs/2.4/mod/core.html#timeout
[2] https://httpd.apache.org/docs/2.4/mod/core.html#keepalivetimeout
Comment 2 Frank Meier 2017-05-09 09:25:22 UTC
Created attachment 34986 [details]
module generating data without flushing to trigger write completion feature
Comment 3 Frank Meier 2017-05-09 09:27:25 UTC
Created attachment 34987 [details]
special https client
Comment 4 Frank Meier 2017-05-09 09:27:46 UTC
Created attachment 34988 [details]
simple httpd config
Comment 5 Frank Meier 2017-05-09 09:39:17 UTC
I've attached my test code and httpd.conf file, if anyone likes to reproduce the issue.

Steps to reproduce:
* start httpd with attached mod_gendata and httpd.conf

* start a shell script loop showing the stacks of the httpd process and its TCP connections:
 -----------------------------
 $ while true; do gstack "${HTTPD_PID}"; netstat -ntp | grep 127.0.0.1:10443; echo ----------; sleep 1; done
 -----------------------------

* start the client with the right amount of data to trigger write completion and fill up the TCP pipeline
 -----------------------------
 $ ./openssl-test 127.0.0.1 10443 '/gendata/?nBytes=850000' 0 1000
 -----------------------------

* after 60s you should see a output like this:
----------------
Thread 4 (Thread 0x7fde1d151700 (LWP 6960)):
#0  0x00007fde1fb4e4ed in poll () from /lib64/libc.so.6
#1  0x00007fde20890dc3 in poll (__timeout=<optimized out>, __nfds=<optimized out>, __fds=0x7fde1d150a90) at /usr/include/bits/poll2.h:46
#2  apr_poll (aprset=aprset@entry=0x7fde1d150b30, num=num@entry=1, nsds=nsds@entry=0x7fde1d150b24, timeout=<optimized out>) at poll/unix/poll.c:120
#3  0x000055780168d8f5 in send_brigade_blocking (c=0x7fde180077b8, bytes_written=0x7fde100057e8, bb=0x7fde18007c60, s=0x7fde18007520) at core_filters.c:747
#4  ap_core_output_filter (f=0x7fde100056d8, new_bb=0x7fde18007c60) at core_filters.c:542
#5  0x00007fde1f639648 in bio_filter_out_pass (outctx=0x7fde18007c40) at ssl_engine_io.c:139
#6  0x00007fde1f63b558 in bio_filter_out_flush (bio=<optimized out>) at ssl_engine_io.c:160
#7  0x00007fde1f63b58f in bio_filter_out_ctrl (bio=<optimized out>, cmd=<optimized out>, num=<optimized out>, ptr=<optimized out>) at ssl_engine_io.c:266
#8  0x00007fde1f64b14b in modssl_smart_shutdown (ssl=ssl@entry=0x7fde1000cfe0) at ssl_util_ssl.c:145
#9  0x00007fde1f63b741 in ssl_filter_io_shutdown (c=0x7fde180077b8, abortive=abortive@entry=0, filter_ctx=0x7fde18007bc0) at ssl_engine_io.c:1023
#10 0x00007fde1f63cdf1 in ssl_io_filter_output (f=0x7fde18007c18, bb=0x7fde10006110) at ssl_engine_io.c:1691
#11 0x00007fde1f639d6a in ssl_io_filter_coalesce (f=0x7fde18007bf0, bb=0x7fde10006110) at ssl_engine_io.c:1648
#12 0x000055780169c3d3 in ap_shutdown_conn (c=c@entry=0x7fde180077b8, flush=flush@entry=0) at connection.c:88
#13 0x00007fde1f866d62 in start_lingering_close_nonblocking (cs=0x7fde18007728) at event.c:910
#14 0x00007fde1f86618c in process_timeout_queue (q=0x55780311ae98, timeout_time=timeout_time@entry=1494322580641883, func=func@entry=0x7fde1f866d10 <start_lingering_close_nonblocking>) at event.c:1509
#15 0x00007fde1f86851f in listener_thread (thd=0x5578030c9e08, dummy=<optimized out>) at event.c:1834
#16 0x00007fde20014444 in start_thread () from /lib64/libpthread.so.0
#17 0x00007fde1fb575ed in clone () from /lib64/libc.so.6
Thread 3 (Thread 0x7fde1d952700 (LWP 6959)):
#0  0x00007fde2001a02f in pthread_cond_wait () from /lib64/libpthread.so.0
#1  0x00007fde208887cd in apr_thread_cond_wait (cond=<optimized out>, mutex=<optimized out>) at locks/unix/thread_cond.c:68
#2  0x00007fde1f86b005 in ap_queue_pop_something (queue=0x5578030c99a0, sd=0x7fde1d951e70, ecs=0x7fde1d951e78, p=0x7fde1d951e80, te_out=0x7fde1d951e88) at fdqueue.c:438
#3  0x00007fde1f86700f in worker_thread (thd=<optimized out>, dummy=<optimized out>) at event.c:1921
#4  0x00007fde20014444 in start_thread () from /lib64/libpthread.so.0
#5  0x00007fde1fb575ed in clone () from /lib64/libc.so.6
Thread 2 (Thread 0x7fde1e153700 (LWP 6958)):
#0  0x00007fde2001a02f in pthread_cond_wait () from /lib64/libpthread.so.0
#1  0x00007fde208887cd in apr_thread_cond_wait (cond=<optimized out>, mutex=<optimized out>) at locks/unix/thread_cond.c:68
#2  0x00007fde1f86b005 in ap_queue_pop_something (queue=0x5578030c99a0, sd=0x7fde1e152e70, ecs=0x7fde1e152e78, p=0x7fde1e152e80, te_out=0x7fde1e152e88) at fdqueue.c:438
#3  0x00007fde1f86700f in worker_thread (thd=<optimized out>, dummy=<optimized out>) at event.c:1921
#4  0x00007fde20014444 in start_thread () from /lib64/libpthread.so.0
#5  0x00007fde1fb575ed in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x7fde2131f780 (LWP 6956)):
#0  0x00007fde2001d0cd in read () from /lib64/libpthread.so.0
#1  0x000055780169f737 in read (__nbytes=1, __buf=0x7ffcb3cc3563, __fd=<optimized out>) at /usr/include/bits/unistd.h:44
#2  ap_mpm_podx_check (pod=<optimized out>) at mpm_unix.c:535
#3  0x00007fde1f864abb in child_main (child_num_arg=child_num_arg@entry=0, child_bucket=child_bucket@entry=0) at event.c:2368
#4  0x00007fde1f8698e5 in make_child (s=0x5578030c82d8, slot=slot@entry=0, bucket=0) at event.c:2461
#5  0x00007fde1f86997c in startup_children (number_to_start=1) at event.c:2490
#6  0x00007fde1f86a63f in event_run (_pconf=<optimized out>, plog=0x5578030cd4a8, s=0x5578030c82d8) at event.c:2857
#7  0x0000557801675f1e in ap_run_mpm (pconf=0x55780309e138, plog=0x5578030cd4a8, s=0x5578030c82d8) at mpm_common.c:94
#8  0x000055780166ef25 in main (argc=3, argv=0x7ffcb3cc3998) at main.c:783
tcp      111      0 127.0.0.1:43112         127.0.0.1:10443         ESTABLISHED 28563/./Debug/opens 
tcp6       0 815616 127.0.0.1:10443         127.0.0.1:43112         ESTABLISHED 6956/httpd    
----------------

* the listener_thread stays in this state for another 60s, and does not handle HTTP requests anymore
Comment 6 Luca Toscano 2017-06-04 08:34:22 UTC
Hi Frank,

thanks a lot for the super detailed report. I am wondering if you'd have the patience to re-run your tests with httpd 2.4.25 with the following patch:

http://home.apache.org/~ylavic/patches/httpd-2.4.x-mpm_event-wakeup-v7.1.patch

It might not solve this particular problem but I'd be curious to know if anything changes.

Thanks!
Comment 7 Frank Meier 2017-06-06 08:34:37 UTC
Hi Luca,

Thank you for looking into this issue. Unfortunately the behavior, as far as I can tell, was exactly the same with your patch, as before. Sorry.

cheers, Frank
Comment 8 Luca Toscano 2017-06-09 16:21:21 UTC
Hi Frank,

everybody is a bit busy with the upcoming 2.4.26 release but this bug will be addressed, I don't have the necessary skills but I'll find somebody soon enough :)

I tried to check event's code and everything seems originating from the periodical call to the following snipped of the listener (as it was stated previously):

            /* Step 2: write completion timeouts */
            process_timeout_queue(write_completion_q, timeout_time,
                                  start_lingering_close_nonblocking);

This is now done periodically by the listener, and when Timeout expires then start_lingering_close_nonblocking ends up in blocking as you described.

Since in this case mod_ssl will have to send a close-notify to gracefully close the TLS connection, I'd think that the listener should not have the chance to even attempt to do any work (risking to block), but just offload it to a spare worker.  

An alternative would be to figure out if modssl_smart_shutdown could avoid to block or just for a very brief amount of time to avoid blocking the listener.

We'll see how others thinks about it! I'll dig a bit more into this issue and report my findings.
Comment 9 Frank Meier 2017-06-14 06:20:09 UTC
Hi Luca,

Thanks again, for pushing this forward. Also my understanding of the code is unfortunately not good enough to propose a patch.

I completely agree. Either the process of closing should be offloaded to a worker thread, or the function "start_lingering_close_nonblocking" should really guarantee, that it is 'nonblocking' like its name suggests. I slightly favor the path, fixing the start_lingering_close_nonblocking function, but I think this might be more difficult to do. The question is, what should happen if the TLS "close notify" could not be sent. Maybe we could just close the connection uncleanly in this case. It would not be so nice, but on the other hand, at some point you want to get rid of the stalled connection and a 'hard' close might be the only way to achieve that.

Cheers
Comment 10 Luca Toscano 2017-06-21 16:08:25 UTC
Hi Frank,

for reference here's some interesting commits to the modssl smart shutdown function:

http://svn.apache.org/viewvc?view=revision&revision=1651077
https://bz.apache.org/bugzilla/show_bug.cgi?id=54998

So I'd say that it is really important for us to keep flushing the close-notify to the client.
Comment 11 Luca Toscano 2017-06-30 09:05:57 UTC
Other two interesting trunk improvements that have not been backported yet:

http://svn.apache.org/viewvc?view=revision&revision=1706669
http://svn.apache.org/viewvc?view=revision&revision=1734656

IIUC these ones are meant to provide a more async behavior to most of the output filters, namely setting aside buckets (on the heap) to avoid blocking.

After a bit of thinking it seems to me that we'd need to find a solution that prevents the mod_ssl_output filter to block, but in a safe way.

In this particular case we assume this about start_lingering_close_nonblocking:

"""
/*
 * Close our side of the connection, NOT flushing data to the client.
 * This should only be called if there has been an error or if we know
 * that our send buffers are empty.
 * Pre-condition: cs is not in any timeout queue and not in the pollset,
 *                timeout_mutex is not locked
 * return: 0 if connection is fully closed,
 *         1 if connection is lingering
 * may be called by listener thread
 */
"""

I tried the following patch:

"""
Index: server/mpm/event/event.c
===================================================================
--- server/mpm/event/event.c	(revision 1800362)
+++ server/mpm/event/event.c	(working copy)
@@ -744,10 +744,7 @@
     conn_rec *c = cs->c;
     apr_socket_t *csd = cs->pfd.desc.s;

-    if (ap_prep_lingering_close(c)
-        || c->aborted
-        || ap_shutdown_conn(c, 0) != APR_SUCCESS || c->aborted
-        || apr_socket_shutdown(csd, APR_SHUTDOWN_WRITE) != APR_SUCCESS) {
+    if (ap_prep_lingering_close(c) || c->aborted) {
         apr_socket_close(csd);
         ap_push_pool(worker_queue_info, cs->p);
         if (dying)
"""

So the idea was to brutally close the connection only if ap_prep_lingering_close(c) is not 0 or if the client has already aborted, but to leave all the other cases to the start_lingering_close_common. This is probably not enough/correct because the connection would go into the lingering_close queue, to be picked up again by process_timeout_queue(linger_q,..) after the timeout that would call stop_lingering_close, that would in turn simply close the socket without giving the possibility to mod_ssl to flush its close-notify (because no ap_shutdown_conn would be called).

Still looking for a better solution :)
Comment 12 Luca Toscano 2017-07-03 11:36:22 UTC
As FYI this bug generated a discussion on dev@:

https://lists.apache.org/thread.html/0daa4c40a4396cbb411f7657b476e2524add73cbc8ed99e67264578c@%3Cdev.httpd.apache.org%3E

It seems that there is an agreement on the safest way to proceed, namely forcing start_lingering_close_nonblocking to be executed on a worker thread.

Note for the readers: in the email thread it was pointed out that ap_prep_lingering_close (called by start_lingering_close_nonblocking before ap_shutdown_conn) could block as well if a module X hooks to pre_close_connection and blocks.
Comment 13 Yann Ylavic 2017-07-21 00:46:57 UTC
Created attachment 35156 [details]
Defer nonblocking lingering close to workers

Patch issued from above discussion.
Comment 14 Yann Ylavic 2017-07-21 16:56:16 UTC
Created attachment 35158 [details]
Defer nonblocking lingering close to workers (v3)

An update to:
- close pending lingering sockets (deferred) on ungraceful restart,
- call usual process_socket() in worker for deferred lingering closes
  the previous patch missed updating the scoreboard (SERVER_CLOSING)
  by calling start_lingering_close_blocking() directly (note this change
  required to put the clogging_input_filters case at the right place to
  avoid reentering process_connection for any state),
- set socket timeout to SECONDS_TO_LINGER (2s) for deferred lingering
  closes since they likely come from a time-up already.
Comment 15 Yann Ylavic 2017-07-22 00:13:12 UTC
Created attachment 35159 [details]
Defer nonblocking lingering close to workers (v4)

Fixes is_idle in worker threads woken up for a deferred close.
Comment 16 Yann Ylavic 2017-07-22 11:30:06 UTC
Created attachment 35160 [details]
Defer nonblocking lingering close to workers (v5)

Same as v4, plus allowing timer threads (trunk only) to handle deferred lingering closes after their work.
Comment 17 Yann Ylavic 2017-07-24 23:22:29 UTC
Committed to trunk in r1802875 (v6).
Comment 18 Yann Ylavic 2017-07-24 23:23:25 UTC
Comment on attachment 35160 [details]
Defer nonblocking lingering close to workers (v5)

Obsoleted by r1802875 (v6).
Comment 19 Luca Toscano 2017-08-04 16:35:01 UTC
Hi Frank, have you had the chance to test Yann's patch to see if it fixes the blocking issue that you reported in your testing environment?
Comment 20 Frank Meier 2017-08-07 12:40:37 UTC
Hi Lucca

sorry I was not able to apply Yann's v6 patch (r1802875) to the 2.4.x branch. There were some merge conflicts I could not resolve without understanding the code better. Is there a version of the patch that is applicable to the 2.4 branch?

cheers, Frank
Comment 21 Yann Ylavic 2017-09-18 09:17:13 UTC
Created attachment 35332 [details]
Backport of r1802875 to 2.4.x

Harmless in CHANGES resolved.
Comment 22 Frank Meier 2017-09-19 15:49:17 UTC
Hi Yann,

I tested your patch against the current 2.4.x head and I was *not* able to reproduce the issue. Whereas it is still reproducible without the patch of course.

Great work!

cheers, Frank
Comment 23 Yann Ylavic 2017-09-19 17:00:20 UTC
Thanks Franck for testing, backport to 2.4.x proposed.
Comment 24 vsamel 2017-09-20 06:03:21 UTC
(In reply to Yann Ylavic from comment #21)
> Created attachment 35332 [details]
> Backport of r1802875 to 2.4.x

My random freezes are resolved with this patch too.

Thanks.
Comment 25 Yann Ylavic 2017-09-22 10:57:50 UTC
Backported to 2.4.28 (r1809299).