Hi,

We are getting occasional httpd core dumps when doing performance/stress testing of our module (roughly 1 core per few hours of stress test). We are using apache 2.2.2 with the worker MPM.

Based on the cores, it always crashes in apache code. It is obviously some memory corruption: SIGSEGV and SIGBUS (about 50% combined, usually on something related to apr_buckets), and more often SIGILL (about 50%), always with the same backtrace as below.

    (dbx) where
    current thread: t@1
    [1] 0x6a2808(0x6a07c0, 0xfeebc008, 0x2f40c0, 0xfeebc008, 0x0, 0x19ce54), at 0x6a2808
    [2] apr_pool_destroy(0x19cce0, 0x0, 0x0, 0x0, 0x19cce0, 0x0), at 0xff0db8c8
    [3] child_main(0xa, 0xf0ca8, 0xf88f4, 0xebe90, 0xf0ce4, 0xf8990), at 0xb13f0
    [4] perform_idle_server_maintenance(0x109ca8, 0x0, 0xf0cb8, 0x0, 0xebe90, 0xf8918), at 0xb1e48
    [5] server_main_loop(0x0, 0xffffffff, 0x3, 0xfeac0050, 0xef968, 0xebe90), at 0xb226c
    [6] ap_mpm_run(0x11400, 0x0, 0xf8990, 0xda7d8, 0xebe90, 0xf8924), at 0xb25ec
    [7] main(0xee8a0, 0x0, 0xef9dc, 0xef9ec, 0x100498, 0xebe90), at 0x2c3cc

Actually, we thought (and still think) that it is some bug in our module code, but with some additional testing (using Parasoft Insure and an apache-2.2.2 build with debug info) we got an assertion failure (SIGABRT) in httpd with the report below (same as the core backtrace). We got the same core 3 times during a 3-day test with the server permanently under 100% load.

    ...
    "unknown", line unknown: Insure trapped signal: 6
    Stack trace where the error occurred:
        __sigprocmask()
        sigacthandler()
        _sigon()
        _thrp_kill()
        raise()
        abort()
        abort() (interface)
        ap_log_assert() log.c, 778
        ap_queue_push() /tmp/apache-2.2.2.rousalm.build/httpd-2.2.2/server/mpm/worker/fdqueue.c, 294
        listener_thread() worker.c, 755
        dummy_worker() threadproc/unix/thread.c, 138
    "unknown", line unknown: Insure trapped signal: 6
    ...

The problem seems to be in:

    apr_status_t ap_queue_push(fd_queue_t *queue, apr_socket_t *sd, apr_pool_t *p)
    {
        ...
        AP_DEBUG_ASSERT(!ap_queue_full(queue));
        elem = &queue->data[queue->nelts];
        elem->sd = sd;
        elem->p = p;
        queue->nelts++;
        ...
    }

From the core I can see that both 'queue->nelts' and 'queue->bounds' are equal to 10 (see the config below), so the queue is full, yet apache tries to add a new connection to it. There is no non-debug error check here except for mutex lock/unlock failure, so without the assert this write goes past the end of 'queue->data' and can certainly cause memory corruption.

Our httpd.conf worker config looks like this:

    ...
    <IfModule mpm_worker_module>
        StartServers          4
        MaxClients          150
        MinSpareThreads      40
        MaxSpareThreads      80
        ThreadsPerChild      10
        MaxRequestsPerChild 10000
    </IfModule>
    ...

I had no time to check all the code related to worker_queue access, so it is still quite possible that this "bug" is triggered by some earlier problem caused by our module. As there is only an assert check, I expect that this condition (queue is full) should not happen under normal circumstances.

Regards,
Michal
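
To illustrate the point above about the missing non-debug check, here is a minimal standalone C sketch of a bounded push that refuses to write when 'nelts' has reached 'bounds', even in a build without assertions. Everything in it (demo_queue_t, demo_elem_t, demo_queue_init, demo_queue_push, the DEMO_QUEUE_* codes) is a hypothetical stand-in made up for this example and is not part of httpd or APR. Locking is left out, and this is not meant to describe the fix applied in httpd.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical return codes for this sketch only. */
#define DEMO_QUEUE_OK   0
#define DEMO_QUEUE_FULL 1

/* Stand-in for one queue slot (socket + pool in the real code). */
typedef struct {
    int   sd;  /* placeholder for the socket handle */
    void *p;   /* placeholder for the per-connection pool */
} demo_elem_t;

/* Stand-in for the bounded queue: 'bounds' slots, 'nelts' in use. */
typedef struct {
    demo_elem_t *data;
    unsigned int nelts;
    unsigned int bounds;
} demo_queue_t;

/* Attach caller-provided storage of 'bounds' slots to the queue. */
void demo_queue_init(demo_queue_t *q, demo_elem_t *storage, unsigned int bounds)
{
    assert(q != NULL && storage != NULL);
    q->data = storage;
    q->nelts = 0;
    q->bounds = bounds;
}

/*
 * Push with a runtime bounds check. Unlike a debug-only assertion,
 * the check stays active in release builds, so a full queue yields
 * an error code instead of a write past the end of 'data'.
 * (No mutex here; a real multi-threaded queue would hold its lock.)
 */
int demo_queue_push(demo_queue_t *q, int sd, void *p)
{
    assert(q != NULL);
    if (q->nelts >= q->bounds) {
        return DEMO_QUEUE_FULL;  /* caller decides what to do */
    }
    q->data[q->nelts].sd = sd;
    q->data[q->nelts].p = p;
    q->nelts++;
    return DEMO_QUEUE_OK;
}
```

With 'bounds' set to 10, as in the core above, an eleventh push into this sketch would return DEMO_QUEUE_FULL rather than touching memory beyond the array. Such a check only contains the damage; it does not explain why the listener tried to push into a full queue in the first place.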
This looks like PR#44402 - closing as duplicate. If the fix for that doesn't work, you can reopen.

*** This bug has been marked as a duplicate of bug 44402 ***