Bug 59762 - after poll error, listener thread tight loops while workers are shutting down
Status: NEW
Alias: None
Product: Apache httpd-2
Classification: Unclassified
Component: mpm_event
Version: 2.5-HEAD
Hardware: PC Linux
Importance: P2 normal
Target Milestone: ---
Assignee: Apache HTTPD Bugs Mailing List
Reported: 2016-06-27 15:56 UTC by Eric Covener
Modified: 2016-09-26 21:18 UTC



Description Eric Covener 2016-06-27 15:56:05 UTC
After a poll error, the listener thread tight-loops while the workers are shutting down, logging the same CRIT message on every iteration.

It seems the loop may just be missing a break, though perhaps it should not break until after back-to-back poll errors? We probably need to review which errors are possible in the different pollset providers and whether any of them are really recoverable (at a minimum, log the CRIT message only once!).
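A minimal sketch of the idea above, in plain C rather than the actual mpm_event code: the loop logs the CRIT message only once and breaks after back-to-back poll errors. Everything here is hypothetical and for illustration only: the poll_fn callback, the stub pollers, the listener_loop() function and the threshold of 2 are assumptions, not httpd or APR names.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical poll callback: returns 0 on success, else an errno value. */
typedef int (*poll_fn)(void *ctx);

/* Assumed threshold: give up after this many back-to-back poll errors. */
#define MAX_CONSECUTIVE_POLL_ERRORS 2

/* Stub pollers for demonstration only. */
static int stub_poll_ok(void *ctx)     { (void)ctx; return 0; }
static int stub_poll_einval(void *ctx) { (void)ctx; return EINVAL; }

/*
 * Runs the poll loop for at most max_iterations passes.
 * Returns the number of passes executed; *crit_logs receives the
 * number of CRIT messages written (0 or 1).
 */
static int listener_loop(poll_fn do_poll, void *ctx, int max_iterations,
                         int *crit_logs)
{
    int consecutive_errors = 0;
    int iterations = 0;

    *crit_logs = 0;
    while (iterations < max_iterations) {
        int rc;

        iterations++;
        rc = do_poll(ctx);
        if (rc == 0) {
            consecutive_errors = 0;   /* a good poll resets the streak */
            continue;
        }
        if (rc == EINTR) {
            continue;                 /* interrupted by a signal: retry */
        }

        consecutive_errors++;
        if (*crit_logs == 0) {        /* log the CRIT message only once */
            fprintf(stderr, "CRIT: poll failed (errno %d)\n", rc);
            (*crit_logs)++;
        }
        if (consecutive_errors >= MAX_CONSECUTIVE_POLL_ERRORS) {
            break;                    /* stop instead of tight-looping */
        }
    }
    return iterations;
}
```

With stub_poll_einval the loop stops after two passes and writes one CRIT line; with stub_poll_ok it runs until max_iterations without logging.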
Comment 1 Yann Ylavic 2016-06-29 23:41:18 UTC
Which AH number for the CRIT message?
Comment 2 Eric Covener 2016-06-29 23:57:56 UTC
(In reply to Yann Ylavic from comment #1)
> Which AH number for the CRIT message?

Whoops: AH03267.
Comment 3 Yann Ylavic 2016-06-30 00:11:34 UTC
Thanks, maybe the errno too? :p
Comment 4 Eric Covener 2016-06-30 00:48:25 UTC
In the repeated log I saw, it was EINVAL, which comes from msgrcv() in the "asio" pollset provider. EINVAL or EBADF would probably be similar for epoll (something has been clobbered and isn't going to get any better).
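One way to express that distinction, as a rough sketch only: classify the errno from a failed poll as transient or fatal. The function name poll_error_is_fatal() is hypothetical, and the classification (EINTR and EAGAIN retried, everything else treated as fatal) is an assumption for illustration, not a review of what each pollset provider can actually return.

```c
#include <errno.h>

/*
 * Hypothetical helper: returns 1 if a poll error is unlikely to clear
 * on retry, 0 if retrying is reasonable.
 */
static int poll_error_is_fatal(int err)
{
    switch (err) {
    case EINTR:     /* interrupted by a signal */
    case EAGAIN:    /* temporary resource shortage */
        return 0;
    case EBADF:     /* descriptor closed or clobbered */
    case EINVAL:    /* invalid argument or handle, e.g. from msgrcv() */
        return 1;
    default:
        return 1;   /* assumption: unknown errors are treated as fatal */
    }
}
```

A listener could consult such a helper before deciding whether to retry the poll or to stop polling and let the shutdown proceed.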
Comment 5 Eric Covener 2016-09-02 18:01:22 UTC
Maybe something like this to help contain the damage in the rare case where poll stops working:

http://people.apache.org/~covener/patches/event-poll_failure.diff
Comment 6 Yann Ylavic 2016-09-14 20:14:04 UTC
(In reply to Eric Covener from comment #5)

Looks good, but is it still needed after r1759011?
Comment 7 Eric Covener 2016-09-26 21:18:42 UTC
(In reply to Yann Ylavic from comment #6)
> (In reply to Eric Covener from comment #5)
> 
> Looks good, but is it still needed after r1759011?

Haha, yes, this little bug caused big pain and is likely the reason I saw this looping on a real system. It seems a little unnecessarily risky, but maybe I'll let it linger here for a while.