Bug 57399

Summary: Many unnecessary CPU wakeups per second
Product: Apache httpd-2 Reporter: Anders Kaseorg <andersk>
Component: mpm_eventAssignee: Apache HTTPD Bugs Mailing List <bugs>
Status: RESOLVED FIXED    
Severity: normal CC: stefan, szg0000, toscano.luca, ylavic.dev
Priority: P2 Keywords: FixedInTrunk
Version: 2.4.10   
Target Milestone: ---   
Hardware: PC   
OS: Linux   
Attachments: Introduce APR_POLLSET_WAKEABLE in mpm-event
mpm_event's listener wakeup
mpm_event's listener wakeup (with fudge factor)
mpm_event's listener wakeup (with fudge factor) for 2.4.x
mpm_event's listener wakeup v5 (trunk)
mpm_event's listener wakeup v5 (2.4.x)
mpm_event's listener wakeup v6 (2.4.x)
mpm_event's listener wakeup v7 (2.4.x)

Description Anders Kaseorg 2014-12-28 22:41:41 UTC
PowerTOP shows httpd waking up 20 times per second on a completely idle server.  This is wasteful of CPU time and power, especially on a laptop or a virtual machine.  The biggest culprit appears to be the following code in server/mpm/event/event.c:

        apr_thread_mutex_lock(g_timer_skiplist_mtx);
        te = apr_skiplist_peek(timer_skiplist);
        if (te) {
            if (te->when > now) {
                timeout_interval = te->when - now;
            }
            else {
                timeout_interval = 1;
            }
        }
        else {
            timeout_interval = apr_time_from_msec(100);
        }
        apr_thread_mutex_unlock(g_timer_skiplist_mtx);

        rc = apr_pollset_poll(event_pollset, timeout_interval, &num, &out_pfd);

which says if there’s no reason to wake up, wake up every 100 ms anyway.  apr_time_from_msec(100) should be replaced with -1.
Comment 1 Eric Covener 2014-12-29 13:17:18 UTC
> which says if there’s no reason to wake up, wake up every 100 ms anyway. 
> apr_time_from_msec(100) should be replaced with -1.

Wouldn't the server then miss timer events registered and fired while there is no poll activity in the server?  Maybe we could document the limited features that use timers and expose the 100ms number as configuration.
Comment 2 Anders Kaseorg 2014-12-29 20:46:17 UTC
Looking at that code, if replacing 100 ms with -1 would cause the server to miss timers registered when no timers are in progress, then it must already be the case that the server will miss timers registered when a longer timer is in progress.  I assumed that, for correctness, anything that registers an earlier timer must also arrange for the listener thread to break out of its current apr_pollset_poll (either via an FD event or via a signal) so that it can be restarted with a lower timeout.  If that isn’t already the case, it should be fixed either way.
Comment 3 Luca Toscano 2016-08-12 06:53:04 UTC
Created attachment 34142 [details]
Introduce APR_POLLSET_WAKEABLE in mpm-event

Adding a patch that should solve the problem, but we'd need to:

1) verify that APR_POLLSET_WAKEABLE is reliable and suitable for mpm-event;
2) test test test
3) verify that if the pollset does not support APR_POLLSET_WAKEABLE the behavior will still be correct.

Thanks a lot for bringing up the issue!
Comment 4 Luca Toscano 2016-08-12 13:44:43 UTC
Something that may be an issue: the calls to process_timeout_queue to manage Keep Alives, Lingering close and Write completion sockets.

The function seems to be called after each wake up, so sleeping indefinitely while idle could lead to miss some events.
Comment 5 Yann Ylavic 2016-08-22 08:48:17 UTC
Created attachment 34167 [details]
mpm_event's listener wakeup

This patch makes full use of wakeups to avoid active polling and locking in the listener's (fast) path.

Would you please test it Anders?
Comment 6 Anders Kaseorg 2016-08-22 09:39:10 UTC
(In reply to Yann Ylavic from comment #5)
> Would you please test it Anders?

Thanks, that’s a massive improvement.  Wakeups are reduced from ~30 per second to 1 per second.  That remaining one is the apr_sleep(apr_time_from_sec(1)) in ap_wait_or_timeout, called from server_main_loop.  Should I file another issue for that?
Comment 7 Yann Ylavic 2016-08-22 12:02:14 UTC
Thanks for testing.

Regarding ap_wait_or_timeout(), waking up waitpid() is another beast, possibly with a signal (but signals and threads...).

This PR looks fine (generic enough) to handle/discuss the case, I think we can keep it open for now.
Comment 8 Yann Ylavic 2016-09-18 20:51:25 UTC
Created attachment 34263 [details]
mpm_event's listener wakeup (with fudge factor)

Patch updated to handle a timeout fudge factor preventing multiple near wakeups within the same 0.1 second (queues), or 0.01 second (timers).
The former could possibly/later be increased to 0.5s sinces queues have a granularity of 1s anyway.
Comment 9 Yann Ylavic 2016-09-18 20:54:34 UTC
Created attachment 34264 [details]
mpm_event's listener wakeup (with fudge factor) for 2.4.x
Comment 10 Yann Ylavic 2016-09-24 13:49:19 UTC
Created attachment 34300 [details]
mpm_event's listener wakeup v5 (trunk)

Fix !listener_is_wakeable case on graceful restart/shutdown (poll() timeout may be infinite while all workers are idle).
Comment 11 Yann Ylavic 2016-09-24 13:54:21 UTC
Created attachment 34301 [details]
mpm_event's listener wakeup v5 (2.4.x)

Same fix + r1732228 from trunk (otherwise good_methods[1:2] are not recognized as wakeable on 64bit systems, i.e. Solaris' port and Linux' epoll).
Comment 12 Yann Ylavic 2016-09-28 22:13:49 UTC
Committed to trunk in r1762718.
Comment 13 Luca Toscano 2016-11-26 08:52:20 UTC
Anybody that wants to test this patch and provide feedback is more than welcome. Eventually it will become a 2.4.x backport proposal but more testing is probably needed (especially on multiple os/platforms).
Comment 14 Stefan Priebe 2016-12-26 20:19:49 UTC
This one no longer applies to httpd 2.4.25.
Comment 15 Yann Ylavic 2016-12-26 23:15:25 UTC
Created attachment 34557 [details]
mpm_event's listener wakeup v6 (2.4.x)

Merges the following commits from trunk:
- r1762580,
- r1762701,
- r1762702,
- r1762718,
- r1762742,
- r1762743,
- r1774538.

Thanks Stefan for testing!
Comment 16 Stefan Priebe 2017-01-17 20:53:37 UTC
Hi Yann,

while testing V6 i'm experiencing segfaults.

exit signal Segmentation

server-error.log:
AH00052: child pid 14110 exit signal Segmentation fault (11)

currently i'm trying to grab a core dump.
Comment 17 Yann Ylavic 2017-01-18 16:47:47 UTC
Created attachment 34641 [details]
mpm_event's listener wakeup v7 (2.4.x)

v6 + r1779354 (fixes queues/pollset race, hopefully).
Comment 18 Yann Ylavic 2017-10-03 11:58:05 UTC
Backported to 2.4.28 in r1802146.