Bug 65803 - Apache hangs after the only one Server gets terminated
Status: RESOLVED DUPLICATE of bug 65769
Alias: None
Product: Apache httpd-2
Classification: Unclassified
Component: mpm_event
Version: 2.4.52
Hardware: PC Linux
Importance: P2 normal
Target Milestone: ---
Assignee: Apache HTTPD Bugs Mailing List
Reported: 2022-01-15 02:21 UTC by Viacheslav Shegai
Modified: 2022-01-16 12:37 UTC

Attachments
premilinary patch (1.76 KB, patch)
2022-01-15 02:27 UTC, Viacheslav Shegai
Fix startup_children() grace period (4.21 KB, patch)
2022-01-15 14:21 UTC, Yann Ylavic
Fix startup_children() grace period (4.35 KB, patch)
2022-01-15 14:47 UTC, Yann Ylavic

Description Viacheslav Shegai 2022-01-15 02:21:27 UTC
Overview:

Server hangs after some time.

Steps to reproduce:

MPM configuration:

<IfModule event.c>
    MaxRequestWorkers       32
    ServerLimit             1
    ThreadLimit             32
    ThreadsPerChild         32
    MinSpareThreads         8
    MaxSpareThreads         40
    MaxConnectionsPerChild  10000000
</IfModule>
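
A quick arithmetic check of why this configuration leaves room for only one child process. This is a minimal sketch, assuming the process count is MaxRequestWorkers divided by ThreadsPerChild and capped by ServerLimit; `max_children` is a hypothetical helper written for illustration, not a function from httpd.

```c
#include <assert.h>

/* Values copied from the MPM configuration above. */
enum { MAX_REQUEST_WORKERS = 32, SERVER_LIMIT = 1, THREADS_PER_CHILD = 32 };

/* Hypothetical helper (not part of httpd): number of child processes the
 * configuration allows, assuming workers / threads-per-child, capped by
 * the server limit. */
static int max_children(int max_request_workers, int threads_per_child,
                        int server_limit)
{
    assert(threads_per_child > 0);
    int n = max_request_workers / threads_per_child;
    return n > server_limit ? server_limit : n;
}
```

Under these assumptions, calling `max_children(MAX_REQUEST_WORKERS, THREADS_PER_CHILD, SERVER_LIMIT)` yields a single child, so there is no second scoreboard slot available while that child is still exiting.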

Additional Information

I dug a little and here's what I found: the startup_children() function in event.c can skip creating a child (the make_child() call) when ap_scoreboard_image->parent[i].pid != 0.

When server_main_loop() uses this function, the remaining_children_to_start variable is set to zero afterwards. So if no child was created and remaining_children_to_start is set to zero anyway, no child is ever created again and the server hangs.

So my proposal is to modify startup_children() so that it returns the actual number of children that remain to be started, instead of the count being set to zero unconditionally.
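
The proposal can be illustrated with a small model. This is only a sketch of the behavior as the reporter describes it, not the code in event.c and not the attached patch: `slot_pid`, `make_child_model` and `startup_children_model` are hypothetical stand-ins for ap_scoreboard_image->parent[i].pid, make_child() and startup_children().

```c
#include <assert.h>

#define SERVER_LIMIT 1

/* Hypothetical stand-in for ap_scoreboard_image->parent[i].pid:
 * a non-zero value means the slot is still occupied. */
static int slot_pid[SERVER_LIMIT];

/* Hypothetical stand-in for make_child(): marks the slot as occupied. */
static void make_child_model(int slot)
{
    assert(slot >= 0 && slot < SERVER_LIMIT);
    slot_pid[slot] = 1000 + slot;
}

/* Skips occupied slots, as described above, and returns how many
 * children still have to be started (the proposed change). */
static int startup_children_model(int number_to_start)
{
    for (int i = 0; number_to_start > 0 && i < SERVER_LIMIT; ++i) {
        if (slot_pid[i] != 0) {
            continue; /* slot still held by an exiting child */
        }
        make_child_model(i);
        --number_to_start;
    }
    return number_to_start;
}
```

With the only slot still occupied, `startup_children_model(1)` returns 1, so a caller that keeps the returned value can retry once the slot is free, whereas a caller that resets the count to zero never would.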

I am not an Apache developer, so please correct me if I am wrong.
Comment 1 Viacheslav Shegai 2022-01-15 02:27:31 UTC
Created attachment 38161
premilinary patch
Comment 2 Yann Ylavic 2022-01-15 14:21:07 UTC
Created attachment 38163
Fix startup_children() grace period

If startup_children() does not create all the requested children, it means that the scoreboard is full (which is the case with "ServerLimit 1" when the only possible child is already gracefully stopping). In this case I don't think that httpd should keep calling startup_children() every second (which attachment 38161 no longer prevents), yet it should keep replacing exiting processes one by one (which the patch rightly addresses).

This new patch should address both cases. Using the global ap_daemons_to_start in startup_children() makes the patch bigger but simplifies the handling of StartServers overall (IMO), while still tracking "startup_children() called once by server_main_loop()" locally.
Does it work for you?
Comment 3 Eric Covener 2022-01-15 14:46:36 UTC
Yann, can you explain why perform_idle_server_maintenance() doesn't recover even if the startup_children() path fails here?
Comment 4 Yann Ylavic 2022-01-15 14:47:37 UTC
Created attachment 38164
Fix startup_children() grace period

Less intrusive version of attachment 38163; please consider this one instead.
Comment 5 Yann Ylavic 2022-01-15 15:00:10 UTC
(In reply to Eric Covener from comment #3)
> Yann can you explain why perform_idle_server_maintenance doesn't recover
> even if the startup_children path fails here?

It doesn't because MinSpareThreads is rounded up to ThreadsPerChild. So on graceful, if the child (here ServerLimit 1) exits after the one-second grace period in server_main_loop(), then remaining_children_to_start is reset, but perform_idle_server_maintenance() will never create any child because idle_thread_count == MinSpareThreads (not < MinSpareThreads).
Comment 6 Yann Ylavic 2022-01-15 15:08:02 UTC
Or so I thought; actually it seems that MinSpareThreads is not rounded up like this. Hm, let me reproduce then.
Comment 7 Yann Ylavic 2022-01-15 16:41:54 UTC
Well, scratch that: idle_thread_count == 0 whenever the unique child exits, thus perform_idle_server_maintenance() does its job. So I can't reproduce.
Sorry for the confusion and the broken mental reconstruction of the issue.
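
The two readings of the idle-maintenance check discussed in these comments can be compared with a tiny model. This is a simplified sketch, assuming the spawn decision reduces to the comparison idle_thread_count < MinSpareThreads quoted above; `should_spawn_child` is a hypothetical name, not a function in event.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model of the spawn decision as described in this thread:
 * a new child is wanted only when idle threads drop below the minimum. */
static bool should_spawn_child(int idle_thread_count, int min_spare_threads)
{
    assert(idle_thread_count >= 0 && min_spare_threads >= 0);
    return idle_thread_count < min_spare_threads;
}
```

With the reporter's MinSpareThreads of 8 and no idle threads left once the only child has exited, `should_spawn_child(0, 8)` is true, which matches the conclusion that maintenance recovers. The earlier, retracted reasoning corresponds to `should_spawn_child(32, 32)`, which is false.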

More likely, it's the same issue as bug 65769. Viacheslav, could you please try the patch from r1896505 ([1])?

[1] https://svn.apache.org/viewvc/httpd/httpd/trunk/server/mpm/event/event.c?r1=1896505&r2=1896504&pathrev=1896505&view=patch
Comment 8 Viacheslav Shegai 2022-01-15 19:57:07 UTC
(In reply to Yann Ylavic from comment #7)
> Well, scratch that, idle_thread_count == 0 whenever the unique child exits
> thus perform_idle_server_maintenance() does its job. So I can't reproduce.
> Sorry for the confusion and the broken mental reconstruction of the issue.
> 
> The more likely is that it's the same issue as bug 65769, Viacheslav could
> you please try the patch from r1896505 ([1])?
> 
> [1]
> https://svn.apache.org/viewvc/httpd/httpd/trunk/server/mpm/event/event.c?r1=1896505&r2=1896504&pathrev=1896505&view=patch

Thanks. I can no longer reproduce the issue with the patch provided.
Comment 9 Yann Ylavic 2022-01-16 12:37:20 UTC

*** This bug has been marked as a duplicate of bug 65769 ***