50702 – Apache child process could crash on shutdown

Summary: Apache child process could crash on shutdown

Status:	RESOLVED DUPLICATE of bug 23238

Alias:	None

Product:	Apache httpd-2
Classification:	Unclassified
Component:	Core
Version:	2.2.15
Hardware:	PC Linux

Importance:	P2 major
Target Milestone:	---
Assignee:	Apache HTTPD Bugs Mailing List

URL:
Keywords:

Depends on:
Blocks:

Reported:	2011-02-01 15:05 UTC by Ahab. A.
Modified:	2011-02-02 08:50 UTC
CC List:	0 users

Description Ahab. A. 2011-02-01 15:05:42 UTC

We have seen core files on Apache shutdown. We were able to see this on two different machines, one a VM and the other a physical machine. The crash does not happen every time, but it seems more likely when the machine is very busy and under heavy load. I was lucky enough to get Purify to catch one child process in the act.

We've looked into the Apache code, and it looks like apr_pool_destroy(), called from clean_child_exit(), doesn't protect itself against being called twice. The core files (call stacks pasted below) are all consistent: a child process is exiting and has already called apr_pool_destroy() for cleanup, only to be signaled again (signal 15, SIGTERM, most likely from the parent). That makes it jump into just_die(), then clean_child_exit(), then apr_pool_destroy(), and then it crashes. A possible solution is for apr_pool_destroy() to protect itself from multiple calls, or for the child to ignore the signal once it has already performed cleanup.
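To illustrate the second idea (skip cleanup if it has already started), here is a minimal sketch. It is not httpd or APR code: demo_pool, demo_pool_destroy(), guarded_clean_child_exit(), and demo_just_die() are hypothetical stand-ins for pchild, apr_pool_destroy(), clean_child_exit(), and just_die(), and the stand-in destroy only counts calls so the effect of the guard can be observed. I am not claiming this is how the problem should be, or was, fixed upstream.

```c
#include <signal.h>

/* Hypothetical stand-in for the child's pool; this is NOT APR's apr_pool_t. */
typedef struct demo_pool {
    int destroy_calls;   /* how many times the "destroy" actually ran */
} demo_pool;

static demo_pool child_pool;               /* plays the role of pchild */
static volatile sig_atomic_t exiting = 0;  /* set once cleanup has begun */

/* Stand-in for apr_pool_destroy(): it only counts invocations. */
static void demo_pool_destroy(demo_pool *p)
{
    p->destroy_calls++;
}

/*
 * Guarded analogue of clean_child_exit().
 * Returns 1 if this call performed the cleanup, 0 if cleanup had already
 * started and the call was skipped. The real function would go on to call
 * exit(); that is left out here so the sketch can be exercised.
 * Assumption: the flag is set before any memory is freed, so a signal
 * arriving in the middle of the destroy sees exiting != 0 and backs off.
 */
static int guarded_clean_child_exit(void)
{
    if (exiting) {
        return 0;
    }
    exiting = 1;
    demo_pool_destroy(&child_pool);
    return 1;
}

/* Analogue of just_die(): the SIGTERM handler takes the same guarded path. */
static void demo_just_die(int sig)
{
    (void)sig;
    guarded_clean_child_exit();
}
```

With demo_just_die installed for SIGTERM, a signal that arrives after the normal exit path has started cleanup becomes a no-op instead of a second destroy of the same pool. A handler that returns could still slip in between the flag check and the flag assignment, so a real fix would also need the handler path to end the process (as just_die() does) or to block SIGTERM around the check.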

I looked at the 2.2.17 and 2.3.10 alpha sources but didn't see any code changes that protect apr_pool_destroy() from being called twice. However, I can only report this against 2.2.15, as that's where we've seen the core dumps.

Here are stack traces:

CORE #1: frames 19 and 8 show multiple cleanup attempts

Program terminated with signal 6, Aborted.
#0  0x00e58402 in __kernel_vsyscall ()
(gdb) where
#0  0x00e58402 in __kernel_vsyscall ()
#1  0x009afc00 in raise () from /lib/libc.so.6
#2  0x009b1451 in abort () from /lib/libc.so.6
#3  0x009e521b in __libc_message () from /lib/libc.so.6
#4  0x009ecf7d in _int_free () from /lib/libc.so.6
#5  0x009f05d0 in free () from /lib/libc.so.6
#6  0x0082727d in apr_allocator_destroy (allocator=0x8cf1800)
    at memory/unix/apr_pools.c:134
#7  0x00827e2d in apr_pool_destroy (pool=0x8cf1888)
    at memory/unix/apr_pools.c:829
#8  0x080c0c64 in clean_child_exit (code=0) at prefork.c:196
#9  0x080c0c8d in just_die (sig=15) at prefork.c:328
#10 <signal handler called>
#11 0x001b4143 in ?? () from /lib/libcrypto.so.6
#12 0x002a0780 in STORE_object_type_string () from /lib/libcrypto.so.6
#13 0x002a3f74 in ?? () from /lib/libcrypto.so.6
#14 0xbf90d308 in ?? ()
#15 0x0026810c in _fini () from /lib/libcrypto.so.6
#16 0x0026810c in _fini () from /lib/libcrypto.so.6
#17 0x009784b2 in _dl_fini () from /lib/ld-linux.so.2
#18 0x009b27f9 in exit () from /lib/libc.so.6
#19 0x080c0c79 in clean_child_exit (code=0) at prefork.c:200
#20 0x080c10f0 in child_main (child_num_arg=<value optimized out>)
    at prefork.c:212
#21 0x080c1337 in make_child (s=0x8c2ca88, slot=5) at prefork.c:758
#22 0x080c1c90 in ap_mpm_run (_pconf=0x8c270a8, plog=0x8c651a0, s=0x8c2ca88)
    at prefork.c:893
#23 0x08069cc5 in main (argc=146952352, argv=0x8c59170) at main.c:740

CORE #2: frames 7 and 0 show apr_pool_destroy() being called twice on the same pool

Program terminated with signal 11, Segmentation fault.
#0  apr_pool_destroy (pool=0x8e18080) at memory/unix/apr_pools.c:357
357             next = node->next;
(gdb) where
#0  apr_pool_destroy (pool=0x8e18080) at memory/unix/apr_pools.c:357
#1  0x080c1134 in clean_child_exit (code=0) at prefork.c:196
#2  0x080c115d in just_die (sig=15) at prefork.c:328
#3  <signal handler called>
#4  0x009c3344 in _int_free () from /lib/libc.so.6
#5  0x009c6fc0 in free () from /lib/libc.so.6
#6  0x00d6788d in apr_allocator_destroy (allocator=0x8e17ff8)
    at memory/unix/apr_pools.c:134
#7  0x00d6844d in apr_pool_destroy (pool=0x8e18080)
    at memory/unix/apr_pools.c:829
#8  0x080c1134 in clean_child_exit (code=0) at prefork.c:196
#9  0x080c15b5 in child_main (child_num_arg=<value optimized out>)
    at prefork.c:212
#10 0x080c17fa in make_child (s=<value optimized out>, slot=2) at prefork.c:758
#11 0x080c18ba in startup_children (number_to_start=3) at prefork.c:776
#12 0x080c2306 in ap_mpm_run (_pconf=0x8d4e550, plog=0x8d8c648, s=0x8d53f30)
    at prefork.c:997
#13 0x080698df in main (argc=148161864, argv=0x8d80618) at main.c:740

Comment 1 Joe Orton 2011-02-02 08:50:44 UTC


*** This bug has been marked as a duplicate of bug 23238 ***