Summary: | Graceful restarts don't affect children in keepalive until they exit | ||
---|---|---|---|
Product: | Apache httpd-2 | Reporter: | Frank T. Lofaro Jr. <ftlofaro> |
Component: | mpm_prefork | Assignee: | Apache HTTPD Bugs Mailing List <bugs> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | andrew.punch, apache, lavr |
Priority: | P2 | Keywords: | PatchAvailable |
Version: | 2.2.4 | ||
Target Milestone: | --- | ||
Hardware: | Sun | ||
OS: | Solaris | ||
Attachments: |
keepalive.py - Python script to keep an apache process alive indefinitely by using the keepalive issue
Patch to solve this bug - based on head of 2.2.x subversion branch
Patch to solve this bug - based on head of 2.2.x subversion branch, including correctly updating mpm_state
Patch to solve this bug - based on head of 2.2.x subversion branch, including correctly updating mpm_state
Patch to solve this bug - based on head of trunk (r1066631), including correctly updating mpm_state
tweaked patch for 2.2.x |
Description
Frank T. Lofaro Jr.
2007-03-01 15:20:07 UTC
(In reply to comment #0)
> Graceful should avoid killing a current request, but keepalive connections may
> be killed at any time when inactive; it should kill a child when it is not
> currently servicing a request.

It is not true. Some modules keep their state with the connection. See bug #41109 for instance. BTW, if you do a graceful restart on production you need to wait anyway.

The prefork version of ap_graceful_stop_signalled is always false:

    int ap_graceful_stop_signalled(void)
    {
        /* not ever called anymore... */
        return 0;
    }

Whereas worker overloads it to mean any kind of graceful exit is happening. The core in 2.2.x uses this callback to determine if it should do keepalive before committing the headers. This appears to be resolved in trunk by using another API.

Via users@: may not be fixed in trunk prefork, needs testing.

Steps are:
1. Configure apache to use:
   - prefork mpm
   - KeepAlive On
   - KeepAliveTimeout 60
   - MaxKeepAliveRequests 0
2. Save ps output, e.g.:
   ( while true; do date; ps -Hfg `cat httpd.pid`; sleep 1 ; done ) > ps.log
3. Run script: keepalive.py <hostname>
4. Send USR1 to the parent process, e.g.:
   sudo kill -USR1 `cat httpd.pid`; date
5. Observe in ps.log that all the child processes exit, except for one. New child processes will start.
6. sudo netstat -tp will indicate that the python script is connected to the one child process that did not exit
7. Leave the system for 15 minutes or longer
8. The one child process will still not exit (check ps.log and netstat -tp)
9. Stop keepalive.py, e.g. using ctrl+c
10. Observe that the one child process will exit once keepalive.py disconnects

Created attachment 26556 [details]
keepalive.py - Python script to keep an apache process alive indefinitely by using the keepalive issue
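
The attachment itself is not reproduced in this report. As a rough illustration only, a client of this kind could look like the sketch below, which holds one HTTP/1.1 connection open by issuing a request more often than KeepAliveTimeout expires. The function name `hold_keepalive` and its parameters are hypothetical and are not taken from the attached keepalive.py.

```python
import http.client
import time


def hold_keepalive(host, port=80, interval=30.0, max_requests=None, path="/"):
    """Reuse a single HTTP/1.1 connection, one request every `interval` seconds.

    Hypothetical helper, not the attached script. Returns the number of
    requests completed on that one connection before the server closed it
    or `max_requests` was reached.
    """
    conn = http.client.HTTPConnection(host, port, timeout=10)
    completed = 0
    try:
        while max_requests is None or completed < max_requests:
            conn.request("GET", path, headers={"Connection": "keep-alive"})
            response = conn.getresponse()
            response.read()  # drain the body so the socket can be reused
            completed += 1
            if response.will_close:
                # The server signalled that it will close the connection;
                # stop here instead of letting http.client reconnect.
                break
            if max_requests is None or completed < max_requests:
                time.sleep(interval)  # keep this below KeepAliveTimeout
    except (http.client.HTTPException, OSError):
        # The server dropped the idle connection between requests.
        pass
    finally:
        conn.close()
    return completed
```

With KeepAliveTimeout 60 and MaxKeepAliveRequests 0 as in the steps above, calling `hold_keepalive("<hostname>", interval=30)` would be expected to pin one prefork child for as long as it runs, which is the behaviour the report describes.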
Observed on Red Hat Enterprise Linux 5.5

ap_graceful_stop_signalled() in http_core.c still calls ap_graceful_stop_signalled() in 2.2.X trunk.
ap_process_http_async_connection() in http_core.c still calls ap_graceful_stop_signalled() in 2.2.X trunk.

Created attachment 26584 [details]
Patch to solve this bug - based on head of 2.2.x subversion branch
The fix for this on the trunk was r645434, which replaced use of ap_graceful_stop_signalled() with ap_mpm_query(). This does look insufficient to fix the bug for prefork, since the prefork signal handler does not change mpm_state (prefork.c:sig_term).

Which of these alternatives do you prefer:
1. Move ahead with my current patch
2. I modify prefork.c so the signal handler changes mpm_state

Created attachment 26598 [details]
Patch to solve this bug - based on head of 2.2.x subversion branch, including correctly updating mpm_state
Created attachment 26599 [details]
Patch to solve this bug - based on head of 2.2.x subversion branch, including correctly updating mpm_state
Created attachment 26600 [details]
Patch to solve this bug - based on head of trunk (r1066631), including correctly updating mpm_state

The two patches that have been attached to this bug use the approaches outlined below.

TRUNK
=====
1. Set the mpm_state to AP_MPMQ_STOPPING

2.2.x BRANCH
============
1. Set the mpm_state to AP_MPMQ_STOPPING
2. Return the correct value from ap_graceful_stop_signalled()

I considered rewriting http_core.c and http_protocol.c in 2.2.x to use ap_mpm_query(). However third party modules may use ap_graceful_stop_signalled(), so it needed to be fixed anyway, and I didn't want to risk breaking code in http_core and http_protocol that was working well. *All* other mpms support ap_graceful_stop_signalled() in 2.2.x, except Netware.

OTHER MPMS
==========
I had a quick check through other 2.2.x MPMs and noticed that the Netware MPM appears to have the same issue as prefork. I am not a netware expert, so it might be good to have someone check that out.

Thanks a lot for the patches, Andrew. I tweaked the trunk patch slightly -

     static void just_die(int sig)
     {
    +    mpm_state = AP_MPMQ_STOPPING;
         clean_child_exit(0);

was redundant since clean_child_exit sets mpm_state anyway. Committed to trunk in r1068389

Created attachment 26623 [details]
tweaked patch for 2.2.x
Slightly tweaked version of 2.2.x patch for review.
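
To make the effect of the 2.2.x approach easier to follow, here is a toy model in Python (not httpd's C, and not the patch itself) of the two steps listed above: record a stopping state when the graceful notification arrives, and have the graceful-stop check report that state so the keepalive decision can see it. The class, method, and function names are all hypothetical, and the way the notification reaches the child is deliberately left abstract.

```python
from enum import Enum


class MpmState(Enum):
    RUNNING = "running"
    STOPPING = "stopping"  # stands in for AP_MPMQ_STOPPING


class PreforkChildModel:
    """Toy stand-in for one prefork child; all names are hypothetical."""

    def __init__(self, patched):
        self.patched = patched
        self.mpm_state = MpmState.RUNNING

    def notify_graceful(self):
        # Step 1 of the 2.2.x approach: remember that we are stopping.
        if self.patched:
            self.mpm_state = MpmState.STOPPING

    def graceful_stop_signalled(self):
        # Unpatched prefork: the function is hard-coded to return 0.
        if not self.patched:
            return False
        # Step 2: report the real state.
        return self.mpm_state is MpmState.STOPPING

    def keep_connection_open(self, client_wants_keepalive):
        # The core only offers keepalive when no graceful stop is pending.
        return client_wants_keepalive and not self.graceful_stop_signalled()


def requests_served_after_graceful(child, client_requests):
    """Count requests a keepalive client still gets after the notification."""
    child.notify_graceful()
    served = 0
    for _ in range(client_requests):
        served += 1  # the request in progress is always completed
        if not child.keep_connection_open(client_wants_keepalive=True):
            break  # connection is closed, so the child is free to exit
    return served
```

In this model an unpatched child keeps serving every request the client sends on the connection, while a patched child finishes the current request and then closes the connection.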
My quick testing confirms that the latest trunk (r1068671), which includes the patch, fixes the problem. As mentioned by Joe, the 2.2.x branch patch is still waiting.

2.2.x patch committed in r1069428

I can confirm that the 2.2.x patch is now in the 2.2.x branch, and my testing indicates that the 2.2.x branch no longer has the problem. Thanks Joe and thanks to my colleague James "Gerbs" Byrne who diagnosed this problem.

*** Bug 38994 has been marked as a duplicate of this bug. ***

*** Bug 47635 has been marked as a duplicate of this bug. *** |