11427 – MPM Memory Leak in child bucket allocation

Summary: MPM Memory Leak in child bucket allocation

Status:	RESOLVED FIXED

Alias:	None

Product:	Apache httpd-2
Classification:	Unclassified
Component:	mpm_winnt
Version:	2.5-HEAD
Hardware:	PC Windows XP

Importance:	P3 major with 17 votes
Target Milestone:	---
Assignee:	Apache HTTPD Bugs Mailing List

URL:
Keywords:	FAQ

Depends on:
Blocks:

Reported:	2002-08-02 16:30 UTC by jerod
Modified:	2008-01-05 10:30 UTC
CC List:	1 user

Attachments
Win32DisableAcceptEx patch (777 bytes, patch) 2008-01-05 07:54 UTC, Tom Donovan

Description jerod 2002-08-02 16:30:21 UTC

Has anyone else seen Apache consume more memory with each connection on
Win NT/2K/XP? If you watch Task Manager, the memory usage of one of the two
Apache.exe processes keeps increasing. The memory is not released when the
connection finishes; it is released only when the Apache service is restarted.
If a workaround or a fix is available, I would appreciate hearing about it.

Comment 1 William A. Rowe Jr. 2002-08-15 21:33:17 UTC

  Please test 2.0.40 and see if you continue to observe the same behavior.

Comment 2 William A. Rowe Jr. 2002-10-14 19:39:10 UTC

  No response from submitter, assuming issue is closed at least by 2.0.43.

Comment 3 Alex Varju 2002-11-23 23:04:48 UTC

I am able to reproduce this problem.  With Apache 2.0.43 on Windows 2000 Pro I
see the memory used by Apache.exe grow like crazy when I run CGI scripts.  The
more text printed to STDOUT in the script, the faster the memory growth.

This does not seem to occur if I access static pages, no matter how large they are.

I have almost the exact same configuration on Linux, Solaris, and FreeBSD, and
this problem does not occur.

Comment 4 Irmund Thum 2002-11-30 15:53:23 UTC

Observations for an SSL-vhost on Win2k with
Apache/2.0.43 (Win32) mod_ssl/2.0.43 OpenSSL/0.9.6g PHP/4.2.4-dev
I've checked the system, httpd.conf and php.ini, and opened server-status.
The PHP scripts have been cleaned up because they were leaving temporary files
undeleted; behavior is better than before, but during many simultaneous
requests (most pages are PHP scripts) the server seems to hang, CPU load goes
to nearly 100 percent, and memory usage roughly doubles from 250 MB to over
500 MB.
I've instructed a local admin to stop and start the server (not graceful) in
such a situation. Otherwise it may take 10 to 20 minutes until the server
cleans up. No requests can be served during this phase. Server status shows a
lot of long-running requests in the reading state. Example:
0-1 1064 0/461/461 R 734 0 0.0 1.71 1.71 ? ? ..reading.. 
0-1 1064 0/0/403 R  756 0 0.0 0.00 1.32  ? ? ..reading..

"Fortunately" this mostly happens only once a day, during business hours on
Mondays and Thursdays when many customers are active.
The PHP scripts are still being cleaned up, but this looks more like a bug in
Apache and/or Win2k.
(i.t 2002-11)

Comment 5 Alex Varju 2002-12-10 21:11:36 UTC

I've just checked out the 2.1-dev branch from CVS, and the problem still occurs.
 The easiest way to reproduce this is to do the following:

1. Create CGI that sends output to stdout.  e.g.:

  #!/usr/bin/perl -I..

  print "Content-type: text/html\n\n";
  for ($i = 0; $i < 2000; $i++)
  {
    printf("%1024s", "asfdasdf");
  }

2. ScriptAlias this script so that it is URL executable.

3. Go to that URL with your browser, and then use the keyboard command to reload
the page (e.g. F5 in IE).  Hold down the reload key, and you should see a number
of processes spawned simultaneously.  Let go after 15-20 seconds, and allow the
processes to finish.  Then look at Apache's memory footprint ... it should be
larger.  Repeat the test, and watch Apache grow.
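
For step 2, the ScriptAlias line in httpd.conf might look like the fragment
below. The directory is a hypothetical Windows install location, not one taken
from this report, so adjust it to the actual setup. On Windows the script's
first line would also need to name the local Perl interpreter.

```apache
# Hypothetical path: adjust to where Apache is actually installed.
# Maps URLs under /cgi-bin/ to CGI scripts in that directory.
ScriptAlias /cgi-bin/ "C:/Apache2/cgi-bin/"
```

With the script saved in that directory under an illustrative name such as
leak.pl, the URL to reload would be http://localhost/cgi-bin/leak.pl.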

Comment 6 Rich Sawin 2003-02-21 21:14:36 UTC

I am having a similar problem with Apache 2.0.44 on Windows NT.  Watching the
process list on the NT Task Manager shows a slow but steady increase in the
amount of memory taken by Apache.exe. 

This continues until Apache stops serving pages (the rest of the machine seems
to work OK) and the error log records "Server ran out of threads to serve
requests.  Consider raising the ThreadsPerChild setting".  This machine does not
handle much traffic (maybe 20 hits per hour).

Restarting Apache brings the memory usage back down and the slow increase begins
again.

Comment 7 Alex Varju 2003-04-04 18:16:32 UTC

I don't know if this is the best solution, but I did manage to resolve this on
our servers with the following change:

Index: server/mpm/winnt/child.c
===================================================================
RCS file: /webct1/cvsroot/external/apache2/server/mpm/winnt/child.c,v
retrieving revision 1.4
retrieving revision 1.5
diff -u -p -u -r1.4 -r1.5
--- server/mpm/winnt/child.c	1 Apr 2003 22:00:37 -0000	1.4
+++ server/mpm/winnt/child.c	3 Apr 2003 00:06:59 -0000	1.5
@@ -123,6 +123,7 @@ AP_DECLARE(void) mpm_recycle_completion_
     if (context) {
         apr_pool_clear(context->ptrans);
         context->next = NULL;
+        context->ba = apr_bucket_alloc_create(context->ptrans);
         ResetEvent(context->Overlapped.hEvent);
         apr_thread_mutex_lock(qlock);
         if (qtail) {
@@ -205,7 +206,7 @@ AP_DECLARE(PCOMP_CONTEXT) mpm_get_comple
                 apr_pool_tag(context->ptrans, "ptrans");
  
                 context->accept_socket = INVALID_SOCKET;
-                context->ba = apr_bucket_alloc_create(pchild);
+                context->ba = apr_bucket_alloc_create(context->ptrans);
                 apr_atomic_inc(&num_completion_contexts); 
                 break;
             }
@@ -440,7 +441,7 @@ static PCOMP_CONTEXT win9x_get_connectio
         context = apr_pcalloc(pchild, sizeof(COMP_CONTEXT));
         apr_pool_create(&context->ptrans, pchild);
         apr_pool_tag(context->ptrans, "ptrans");
-        context->ba = apr_bucket_alloc_create(pchild);
+        context->ba = apr_bucket_alloc_create(context->ptrans);
     }
     
     while (1) {

Comment 8 Christophe LEITIENNE 2003-07-02 07:07:57 UTC

I can reproduce this problem under Red Hat Linux 9.0, on a P200 with 128MB.
Apache is configured to use prefork MPM.

When serving big files with PHP or CGI shell scripts, the child process memory
size grows proportionally to the amount of data sent.
The size of each child process (if not used) then decreases *very* slowly.

Comment 9 Joe Orton 2004-06-03 21:57:54 UTC

If Alex's analysis is correct per patch, this is a bug in the way the WinNT MPM
creates the bucket allocator; Christophe, you're probably seeing a different issue.

Comment 10 Matt Whitlock 2004-10-18 17:45:27 UTC

I am also seeing this bug.  Apache will start out at around 18 MB of memory 
usage and slowly grow until it chokes the server to death.  I have seen Apache 
at over 200 MB of memory usage, but I'm sure it gets even higher since my 
server has actually BSOD'd due to its RAM being exhausted.

I am using Apache 2.0.52 and PHP 5.0.2 as a module.  It appears to consume a 
little bit more memory with every request.  Not sure if it grows on static 
pages or just PHP pages, but I know that this was also a problem with PHP 4.3.2 
and 4.3.9 on Apache 2.0.52.

Comment 11 Jean-Paul Horn 2005-02-14 12:21:15 UTC

Still seeing this as well with Apache 2.0.52 and PHP 5.0.3, with the same
behaviour as previous posters. Is the patch applied in 2.0.53 and, if not, how
do I apply it myself?

Comment 12 Laura J Bryson 2005-03-01 17:33:54 UTC

We are experiencing a very similar issue.  Apache.exe starts out around 30 MB,
and has grown to nearly 800 MB before we have had to restart.  We are running
Apache 2.0.47 with SSL, mod_proxy, and PHP 4.3.4.4 (module) on Windows 2000
Advanced Server Service Pack 4.

Comment 13 Tim 2006-01-13 13:09:21 UTC

Yep, I have Apache 2.0.55 with all the latest Windows and PHP updates.
The server fails to free some allocated memory somewhere, and memory usage
increases until the server crashes and automatically reboots.
I have looked at the error log file. After each crash, the number of "workers"
is reduced, with a message that there is not enough memory to start all
workers, until the server simply hangs.
I think the problem comes from the "mod_perl" extension on Windows,
since I get this line twice:
[error] <none>=HASH(0x5144000)
each time before the party starts ;))
I will test this by removing "mod_perl" from the server and letting it run for
a few days, and I will add my test results here.

Comment 14 William A. Rowe Jr. 2007-12-22 11:35:22 UTC

This bug was several different things (as tends to happen) but the
patch provided looks like a prime target to correct the scope of
the allocator.  Considering this patch for commit.

Comment 15 William A. Rowe Jr. 2007-12-28 23:58:47 UTC

Alex, this was definitely one aspect of the flaw, and your patch was
correct and committed for the next 2.0 and 2.2 releases in the coming week.

Remains to be seen if this was the entire scope of misallocations by the
winnt mpm.  Marking as needinfo for feedback from users.

PLEASE keep in mind that the children are supposed to grow to the MaxMemFree
limit and then remain at a steady state of memory allocation.  This means that
if you repeat a stress test a number of times, the memory should cease to grow.
Limiting the number of workers to something smaller, such as 5 or 25, will make
this easier to observe.

If you determine that the server's steady state uses too much memory, it's
time to consider lowering the number of worker threads.
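
A configuration for such a steady-state test might look like the fragment
below. Both directives exist for the winnt MPM, but the values are illustrative
assumptions, not recommendations from this report. MaxMemFree is given in
kilobytes.

```apache
# Illustrative values only: a small worker count makes a plateau easier to see.
ThreadsPerChild 25
# Upper bound (in KB) on free memory the main allocator keeps before calling free().
MaxMemFree 2048
```

With a small ThreadsPerChild, repeated runs of the same stress test should show
memory levelling off rather than growing on every run.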

Comment 16 Tom Donovan 2008-01-05 07:54:02 UTC

Created attachment 21348
Win32DisableAcceptEx patch

The current patch (change 607393 to trunk / 607394 to 2.2.x) disables the
Win32DisableAcceptEx directive and Win9x platforms because
win9x_get_connection() destroys the bucket allocator whenever it clears the
trans pool.

The attached patch (against trunk) creates a new bucket allocator for each
transaction.

Comment 17 William A. Rowe Jr. 2008-01-05 10:30:06 UTC

Tom, your patch was spot-on.  Your patch from comment #16 applied for trunk,
2.2 branch (for 2.2.8) and 2.0 branch (for 2.0.63).  Thanks!

Now marking this FIXED; if there are other unrelated memory leaks found they
should be assigned their own incidents.