Bug 39275 - slow child_init causes MaxClients warning
Summary: slow child_init causes MaxClients warning
Status: REOPENED
Alias: None
Product: Apache httpd-2
Classification: Unclassified
Component: mpm_worker
Version: 2.5-HEAD
Hardware: All All
Importance: P2 normal
Target Milestone: ---
Assignee: Apache HTTPD Bugs Mailing List
URL:
Keywords: PatchAvailable
Depends on:
Blocks:
 
Reported: 2006-04-11 21:09 UTC by Chris Darroch
Modified: 2009-10-08 15:47 UTC



Attachments
adds a per-process status field (1.73 KB, patch)
2006-04-11 21:10 UTC, Chris Darroch
adds process_score status (358 bytes, patch)
2006-04-11 21:14 UTC, Chris Darroch
a module which introduces a delay in child_init() for testing (1.56 KB, text/plain)
2006-08-30 22:14 UTC, Peter Poeml
backport for 2.0.x (747 bytes, patch)
2006-09-01 21:26 UTC, Peter Poeml

Description Chris Darroch 2006-04-11 21:09:19 UTC
Per this thread on the httpd-dev mailing list:

http://marc.theaimsgroup.com/?t=114453426100001&r=1&w=2

I have been seeing something similar with 2.2.0 using the worker
MPM, where with the following settings, I get over 10 child processes
initializing immediately (e.g., up to 15), and then they drop back to
10.  I see the "server reached MaxClients" message as well right
after httpd startup, although nothing is connecting yet.

<IfModule mpm_worker_module>
    StartServers         10
    MaxClients          150
    MinSpareThreads      25
    MaxSpareThreads     100
    ThreadsPerChild      10
</IfModule>

In my case, the problem relates to how long the child_init phase
takes to execute.  I can "tune" this by raising DBDMin (and DBDKeep)
so that mod_dbd attempts to open increasingly large numbers of
DB connections during child_init.  With DBDMin set to 0 or 1,
all is well; no funny behaviour.  With DBDMin and DBDKeep at 3,
things (for me) go pear-shaped.

In server/mpm/worker/worker.c, after make_child() creates a
child process it immediately sets the scoreboard parent slot's pid
value.  The main process goes into server_main_loop() and begins
executing perform_idle_server_maintenance() every second; this
looks at any process with a non-zero pid in the scoreboard and
assumes that any of its worker threads marked SERVER_DEAD are,
in fact, dead.

However, if the child processes are starting "slowly" because
ap_run_child_init() in child_main() is taking its time, then
start_threads() hasn't even been run yet, so the threads aren't
marked SERVER_STARTING -- they're just set to 0 as the default
value.  But 0 == SERVER_DEAD, so the main process sees a lot
of dead worker threads and begins spawning new child processes,
up to MaxClients/ThreadsPerChild in the worst case.  In this case,
when no worker threads have started yet, but all possible child
processes have been spawned (and are working through their
child_init phases), then the following is true and the
"server reached MaxClients" message is printed, even though
the server hasn't started accepting connections yet:

    else if (idle_thread_count < min_spare_threads) {
        /* terminate the free list */
        if (free_length == 0) {
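
To make the failure mode above concrete, here is a minimal sketch that models only the counting logic, not the real worker.c code.  The function and struct names are hypothetical, and the one-child-per-pass spawn rate is a simplifying assumption; the only facts taken from this report are that thread slots default to 0, that 0 == SERVER_DEAD, and that the ceiling is MaxClients/ThreadsPerChild.

```c
#include <assert.h>

#define SERVER_DEAD 0   /* per the report: the default slot value is 0 */

/* Hypothetical result type for this sketch. */
struct startup_result {
    int children;          /* child processes forked in total */
    int hit_max_clients;   /* 1 if the "reached MaxClients" branch was taken */
};

/*
 * Model maintenance passes that all happen while every child is still
 * inside child_init, i.e. before any child has run start_threads().
 */
struct startup_result simulate_slow_startup(int start_servers,
                                            int max_clients,
                                            int threads_per_child,
                                            int min_spare_threads,
                                            int passes)
{
    assert(threads_per_child > 0);

    int max_children = max_clients / threads_per_child;
    struct startup_result r = { start_servers, 0 };

    /* Every thread slot is still SERVER_DEAD (0), so the parent never
     * counts a single idle thread during these passes. */
    int idle_thread_count = 0;

    for (int p = 0; p < passes; p++) {
        if (idle_thread_count < min_spare_threads) {
            if (r.children >= max_children) {
                /* Corresponds to free_length == 0 in the excerpt above. */
                r.hit_max_clients = 1;
                break;
            }
            r.children++;   /* assumption: one new child per pass */
        }
    }
    return r;
}
```

With the settings from this report (StartServers 10, MaxClients 150, ThreadsPerChild 10, MinSpareThreads 25), calling simulate_slow_startup(10, 150, 10, 25, 10) ends with 15 children and the MaxClients branch taken, which matches the "up to 15" processes observed.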

I considered wedging another thread status into the
scoreboard, between SERVER_DEAD (the initial value) and
SERVER_STARTING.  make_child() would set all the thread
slots to this value and start_threads() would later flip them
to SERVER_STARTING after actually creating the worker threads.
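
For illustration only, the extra-thread-status idea might look roughly like the sketch below.  TOY_SERVER_PRESTART is a hypothetical name and its numeric value is arbitrary; the toy_ functions only stand in for what make_child() and start_threads() would do and are not httpd APIs.

```c
#include <assert.h>

/* SERVER_DEAD is 0 per the report; the other values here are arbitrary. */
enum toy_status {
    TOY_SERVER_DEAD = 0,
    TOY_SERVER_PRESTART,    /* hypothetical: forked, threads not created */
    TOY_SERVER_STARTING
};

#define TOY_THREADS 10      /* ThreadsPerChild in the report's config */

/* What make_child() would do under this idea: reserve every slot. */
void toy_make_child(enum toy_status slots[TOY_THREADS])
{
    for (int i = 0; i < TOY_THREADS; i++)
        slots[i] = TOY_SERVER_PRESTART;
}

/* What start_threads() would do once it creates worker thread i. */
void toy_start_thread(enum toy_status slots[TOY_THREADS], int i)
{
    assert(i >= 0 && i < TOY_THREADS);
    slots[i] = TOY_SERVER_STARTING;
}

/* The maintenance pass treats only genuinely dead slots as dead. */
int toy_count_dead(const enum toy_status slots[TOY_THREADS])
{
    int dead = 0;
    for (int i = 0; i < TOY_THREADS; i++)
        if (slots[i] == TOY_SERVER_DEAD)
            dead++;
    return dead;
}
```

After toy_make_child() runs, toy_count_dead() reports no dead slots even though no worker thread exists yet, so a slow child_init would no longer look like a dead process.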

That would have various ripple effects on other bits of
httpd, though, like mod_status and other MPMs, etc.  So instead
I tried adding a status field to the process_score scoreboard
structure, and making the following changes to worker.c such that
this field is set by make_child to SERVER_STARTING and then
changed to SERVER_READY once the start thread that runs
start_threads() has done its initial work.

During this period, while the new child process is running
ap_run_child_init() and friends, perform_idle_server_maintenance()
just counts that child process's worker threads as all being
effectively in SERVER_STARTING mode.  Once the process_score.status
field changes to SERVER_READY, perform_idle_server_maintenance()
begins to look at the individual thread status values.
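
The per-process status approach described above can be sketched as follows.  This is not the attached patch: the struct layout, the toy_ names and tally_child() are hypothetical, and the numeric values of SERVER_STARTING and SERVER_READY are assumed for the example (only 0 == SERVER_DEAD comes from this report).

```c
#include <assert.h>

#define SERVER_DEAD     0   /* per the report */
#define SERVER_STARTING 1   /* assumed value for this sketch */
#define SERVER_READY    2   /* assumed value for this sketch */

#define THREADS_PER_CHILD 10

/* Hypothetical stand-in for one child's scoreboard entries. */
struct toy_child {
    int pid;                               /* 0 = slot unused */
    int status;                            /* proposed per-process field */
    int thread_status[THREADS_PER_CHILD];
};

struct toy_counts {
    int starting;
    int idle;
    int dead;
};

/* Add one child's threads to the running totals. */
void tally_child(const struct toy_child *c, struct toy_counts *out)
{
    assert(c && out);

    if (c->pid == 0) {
        /* Unused slot: all of its threads are free. */
        out->dead += THREADS_PER_CHILD;
        return;
    }
    if (c->status == SERVER_STARTING) {
        /* Child is still in child_init: trust the process-level field
         * and ignore the zeroed per-thread slots. */
        out->starting += THREADS_PER_CHILD;
        return;
    }
    /* Process is ready: now the per-thread values are meaningful. */
    for (int i = 0; i < THREADS_PER_CHILD; i++) {
        switch (c->thread_status[i]) {
        case SERVER_DEAD:     out->dead++;     break;
        case SERVER_STARTING: out->starting++; break;
        case SERVER_READY:    out->idle++;     break;
        default:              break;           /* busy states not modelled */
        }
    }
}
```

A child with a non-zero pid, status SERVER_STARTING and all-zero thread slots is tallied as ten starting threads rather than ten dead ones, which is the behaviour the description asks for.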
Comment 1 Chris Darroch 2006-04-11 21:10:32 UTC
Created attachment 18070 [details]
adds a per-process status field
Comment 2 Chris Darroch 2006-04-11 21:14:47 UTC
Created attachment 18071 [details]
adds process_score status
Comment 3 Chris Darroch 2006-04-11 21:49:42 UTC
I should add that this patch is more food-for-thought than a clear fix.
For one thing, other MPMs like event and prefork aren't considered.
For another, hard and graceful restarts may not play well with the per-process
status field, since at least with graceful restarts, new processes can
gradually take over thread slots in the scoreboard from an exiting process.
I'll do some additional testing and perhaps a better solution will present itself.
Comments welcome!
Comment 4 Greg Ames 2006-05-03 00:34:28 UTC
svn rev. 399099 should take care of it. 
Comment 5 Peter Poeml 2006-08-30 22:10:54 UTC
Greg, svn rev. 399099 works fine during server start. However, the same problem
occurs during a graceful restart.
Comment 6 Peter Poeml 2006-08-30 22:14:15 UTC
Created attachment 18771 [details]
a module which introduces a delay in child_init() for testing
Comment 7 Peter Poeml 2006-09-01 21:26:42 UTC
Created attachment 18806 [details]
backport for 2.0.x

This is the same fix for 2.0.x. It has the same limitation as the 2.2.x
backport and the fix in trunk: startup is fixed, but graceful
restart isn't.