Pb was reproduced with 2.2.6 on AIX and Solaris. Once piped log are configured (ex with rotatelogs), if you start Apache, you will find two copies of the errorlog piped program: - one has been started a few seconds prior to the second one - the first one has parent pid=1 (not httpd) and is attached to your tty Example below: UID PID PPID C STIME TTY TIME CMD root 1761522 1 0 18:28:05 - 0:00 httpd -k start xxxx 1892358 1761522 0 18:28:05 - 0:00 httpd -k start root 1077456 1 0 18:28:02 pts/12 0:00 rotatelogs errors.%Y%m%d 86400 root 2170970 1761522 0 18:28:05 - 0:00 rotatelogs errors.%Y%m%d 86400 Since the first one is attached to your tty, you can no longer exit your ssh connection easily nor automate things with remote shell commands Located the pb in server/log.c: latest revision commited in 2.2.x branch introduced the regression http://svn.apache.org/viewvc/httpd/httpd/branches/2.2.x/server/log.c? r1=569542&r2=570451&diff_format=h Note: if first piped program is killed, no prob since the second one is the one doing the job
I'm wondering if there's a way to determine if AIX picked up on the availability of RELIABLE_PIPED_LOGS... can you take a look at your configured include/apr.h file and look for a line such as; #define APR_HAS_OTHER_CHILD 1 and report back? Thanks for the report!
Another thought. Are you using worker mpm plus mod_cgid? If we've spawned off a long lived cgid or similar process during the startup, that extra fork()'ed process may be holding the stderr handle open to the very first piped logger.
Created attachment 20888 [details] Patch against trunk Can you please check if the attached patch fixes your problem?
(In reply to comment #1) > I'm wondering if there's a way to determine if AIX picked up on the The same happens on Linux. See also my attached patch. IMHO the problem is that we keep a copy of the pipe in s_main->error_log after we dup2ed it into stderr_log. This one gets inherited by the second piped rotater for the main server. Thus the first piped rotater, which is started while the server is still attached to the terminal does not shut down.
Please confirm the attached patch resolves the flaw; it will be tagged for prompt attention and target release 2.2.7, if so.
I can reproduce the problem for Solaris. The patch posted by Bill on httpd-dev fixes the problem on Solaris (apart from a missing semicolon) as well as the one you attached. I can provide full truss output, if someone needs it. Below find an excerpt with the relevant diff. This rotatelogs process gets forked from the httpd process, that does the first configuration validation. Even without the patch: during the first (graceful) restart the additional rotatelogs process terminates. Caution: I use a slightly non standard build, using the shell /usr/xpg4/bin/sh nsted of /bin/sh in /srclib/apr/include/arch/unix/apr_arch_threadproc.h: #define SHELL_PATH "/usr/xpg4/bin/sh" On Solaris one otherwise gets double processes for rotatelogs, since /bin/sh creates a second process when calles with "-c". This is a separate observation not only restricted for the error logger of the main server. truss with bug (28235 startup httpd process, 28237 rotatelogs forked by that): 28235: bind(3, 0x0007C608, 32, 3) = 0 28235: AF_INET6 name = :: port = 8001 28235: listen(3, 511, 1) = 0 ########## THIS part goes away with the patch ########## 28235: pipe() = 7 [8] 28235: fcntl(7, F_GETFL, 0x00000000) = 2 28235: fstat64(7, 0xFFBEF6D0) = 0 28235: d=0x04140000 i=9338918 m=0010000 l=0 u=1200 g=1200 sz=0 28235: at = Sep 26 20:44:54 CEST 2007 [ 1190832294 ] 28235: mt = Sep 26 20:44:54 CEST 2007 [ 1190832294 ] 28235: ct = Sep 26 20:44:54 CEST 2007 [ 1190832294 ] 28235: bsz=5120 blks=0 fs=fifofs 28235: fstat64(7, 0xFFBEF6D0) = 0 28235: d=0x04140000 i=9338918 m=0010000 l=0 u=1200 g=1200 sz=0 28235: at = Sep 26 20:44:54 CEST 2007 [ 1190832294 ] 28235: mt = Sep 26 20:44:54 CEST 2007 [ 1190832294 ] 28235: ct = Sep 26 20:44:54 CEST 2007 [ 1190832294 ] 28235: bsz=5120 blks=0 fs=fifofs 28235: fcntl(7, F_SETFL, 0x00000082) = 0 ########## END PART ########## 28235: brk(0x0014A800) = 0 28235: brk(0x0014C800) = 0 28235: pipe() = 9 [10] 28235: pipe() = 11 [12] 28235: fcntl(1, F_DUP2FD, 0x0000000B) = 11 28235: access("/opt/build/test/apache22-prefork-2.2.6_0.9.8e-1/bin/rotatelogs", 5) = 0 28235: fork1() = 28237 28237: fork1() (returning as child ...) = 28235 28237: getpid() = 28237 [28235] truss with patch (5966 httpd startup process, 5968 forked rotatelogs): 5966: bind(3, 0x0007E628, 32, 3) = 0 5966: AF_INET6 name = :: port = 8001 5966: listen(3, 511, 1) = 0 5966: pipe() = 7 [8] 5966: brk(0x0014C820) = 0 5966: brk(0x0014E820) = 0 5966: pipe() = 9 [10] 5966: pipe() = 11 [12] 5966: fcntl(1, F_DUP2FD, 0x0000000B) = 11 5966: access("/opt/build/test/apache22-prefork-2.2.6_0.9.8e-1/bin/rotatelogs", 5) = 0 5966: fork1() = 5968 5968: fork1() (returning as child ...) = 5966 5968: getpid() = 5968 [5966]
I could reproduce the same problem for 2.0.61 under Solaris. Will test patch there also, because log.c is very similar to 2.2.6.
The patch also fixes BZ 40651 for me. I could still reproduce 40651 with 2.2.6.
Same patch fixes problem for 2.0.61 prefork/Solaris.
Believe we have a line on this, with luck it will be addressed very soon as release 2.2.7.
(In reply to comment #5) > Please confirm the attached patch resolves the flaw; it will be tagged for prompt > attention and target release 2.2.7, if so. Patch solves pb on both my AIX and Solaris platforms.
Committed to trunk as r580431 (http://svn.apache.org/viewvc?rev=580431&view=rev) and proposed for backport to 2.2.x as r580434 (http://svn.apache.org/viewvc?rev=580434&view=rev).
Patch was backported ... PR wasn't closed at the time.