Summary: | httpd hung at __lll_lock_wait / pthread_mutex_lock | ||
---|---|---|---|
Product: | Apache httpd-2 | Reporter: | Seann Herdejurgen <seann> |
Component: | mod_http2 | Assignee: | Apache HTTPD Bugs Mailing List <bugs> |
Status: | NEW --- | ||
Severity: | major | CC: | deuba, seann |
Priority: | P2 | ||
Version: | 2.4.53 | ||
Target Milestone: | --- | ||
Hardware: | PC | ||
OS: | Linux | ||
Attachments: |
pstack output
pmap output lsof output pstack output Perl trace output |
Description
Seann Herdejurgen
2022-10-17 11:54:38 UTC
Created attachment 38417 [details]
pmap output
Created attachment 38418 [details]
lsof output
What I can see in the pstack output is many threads like this: Thread 73 (Thread 0x7f8842ffd640 (LWP 1887577) "httpd"): #0 0x00007f887e69c3ba in __futex_abstimed_wait_common () from /lib64/libc.so.6 #1 0x00007f887e69ebc0 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libc.so.6 #2 0x00007f887dbd6b73 in modperl_tipool_pop () from /etc/httpd/modules/mod_perl.so #3 0x00007f887dbd6c8b in modperl_interp_get () from /etc/httpd/modules/mod_perl.so #4 0x00007f887dbd6fce in modperl_interp_select () from /etc/httpd/modules/mod_perl.so #5 0x00007f887dbde587 in modperl_config_req_cleanup () from /etc/httpd/modules/mod_perl.so #6 0x00007f887e99bcbe in apr_pool_destroy () from /lib64/libapr-1.so.0 #7 0x00007f887e99bc9d in apr_pool_destroy () from /lib64/libapr-1.so.0 #8 0x000055e3318b876e in ap_read_request () #9 0x000055e3318da24b in ap_process_http_connection () #10 0x000055e3318a5dd8 in ap_run_process_connection () #11 0x00007f887e10d9a5 in process_socket () from /etc/httpd/modules/mod_mpm_event.so #12 0x00007f887e10e43a in worker_thread () from /etc/httpd/modules/mod_mpm_event.so #13 0x00007f887e69f822 in start_thread () from /lib64/libc.so.6 #14 0x00007f887e63f450 in clone3 () from /lib64/libc.so.6 I am not familiar with that mod_perl is doing here. But it seems to block when a HTTP/2 request/response has been handled and the allocated resources a freed. This free of a HTTP/2 stream (request/response) is done under lock of the "main" connection. And that prevents other requests on the same connection to progress. That is the lock waits you see in output filters. What mod_perl is waiting for here, I do not know. I have a production environment and a QA environment. In production, DBM data is stored in a tmpfs file system instead of hark disk. With a tmpfs file system, I am unable to reproduce deadlock using mod_perl's PerlResponseHandler ModPerl::Registry. I have been running ModPerl::Registry using DBM data on tmpfs file system in production for two days without a deadlock. If I implement storing DBM data in tmpfs file system in QA I do not see deadlock. If I revert to hard disk then I can reproduce deadlock within a couple of minutes. This implies to me that whatever Perl is waiting for in the deadlock situation is related to the speed of the storage subsystem. I located the source code for the modperl_tipool_pop() method. If the XXX: code below is executed, the method returns before releasing the modperl_tipool_lock(tipool) lock. modperl_list_t *modperl_tipool_pop(modperl_tipool_t *tipool) { modperl_list_t *head; modperl_tipool_lock(tipool); if (tipool->size == tipool->in_use) { if (tipool->size < tipool->cfg->max) { MP_TRACE_i(MP_FUNC, "no idle items, size %d < %d max\n", tipool->size, tipool->cfg->max); if (tipool->func->tipool_rgrow) { void * item = (*tipool->func->tipool_rgrow)(tipool, tipool->data); modperl_tipool_add(tipool, item); } } /* block until an item becomes available */ modperl_tipool_wait(tipool); } head = tipool->idle; tipool->idle = modperl_list_remove(tipool->idle, head); tipool->busy = modperl_list_append(tipool->busy, head); tipool->in_use++; /* XXX: this should never happen */ if (!head) { MP_TRACE_i(MP_FUNC, "PANIC: no items available, %d of %d in use\n", tipool->in_use, tipool->size); abort(); } modperl_tipool_unlock(tipool); return head; } I am still able to cause deadlock now even using RAM disk. I wrote the following script to analyze running httpd processes when I encounter deadlock. #!/bin/sh for pid in $(pgrep -u apache httpd); do pstack $pid | egrep 'lock|modperl_tipool' done Here is sample output when a deadlock occurs: #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #0 0x00007f1bf0c9c580 in __lll_lock_wait () from target:/lib64/libc.so.6 #1 0x00007f1bf0ca2c42 in pthread_mutex_lock@@GLIBC_2.2.5 () from target:/lib64/libc.so.6 #0 0x00007f1bf0c9c580 in __lll_lock_wait () from target:/lib64/libc.so.6 #1 0x00007f1bf0ca2c42 in pthread_mutex_lock@@GLIBC_2.2.5 () from target:/lib64/libc.so.6 #0 0x00007f1bf0c9c580 in __lll_lock_wait () from target:/lib64/libc.so.6 #1 0x00007f1bf0ca2c42 in pthread_mutex_lock@@GLIBC_2.2.5 () from target:/lib64/libc.so.6 #0 0x00007f1bf0c9c580 in __lll_lock_wait () from target:/lib64/libc.so.6 #1 0x00007f1bf0ca2c42 in pthread_mutex_lock@@GLIBC_2.2.5 () from target:/lib64/libc.so.6 #0 0x00007f1bf0c9c580 in __lll_lock_wait () from target:/lib64/libc.so.6 #1 0x00007f1bf0ca2c42 in pthread_mutex_lock@@GLIBC_2.2.5 () from target:/lib64/libc.so.6 #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so #2 0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so When httpd server is not deadlocked, this script produces no output. I updated the modperl_tipool_pop() method in modperl_tipool.c and moved the following line before the XXX: code section in the same file: modperl_tipool_unlock(tipool); After installing the newly compiled module and testing in FT, QA and production environments, I was unable to create a deadlock. In additional testing this morning, I was able to reproduce the deadlock on a slower system. I recompiled mod_perl.so with MP_DEBUG=1 and enabled PerlTrace i. For some reason I am only seeing three Perl interpreters starting and a max of five: [pid=2337140, tid=7fdb979e5900, perl=5594526d6150] modperl_tipool_init: start=3, max=5, min_spare=3, max_spare=3 I have the following lines in httpd.conf: PerlInterpStart 16 PerlInterpMax 128 PerlInterpMinSpare 16 I thought these lines would increase the max number of Perl interpeters. In looking at source, USE_ITHREADS needs to be #defined. I'm not sure how to confirm that it is being defined. I'm guessing that it is because apachectl configtest passes and httpd processes start. proxy::</home/seann/rpmbuild/SOURCES/mod_perl-2.0.12> perl -V:useithreads useithreads='define'; proxy::</home/seann/rpmbuild/SOURCES/mod_perl-2.0.12> perl -v This is perl 5, version 34, subversion 1 (v5.34.1) built for x86_64-linux-thread-multi Copyright 1987-2022, Larry Wall Perl may be copied only under the terms of either the Artistic License or the GNU General Public License, which may be found in the Perl 5 source kit. Complete documentation for Perl, including FAQ lists, should be found on this system using "man perl" or "perldoc perl". If you have access to the Internet, point your browser at http://www.perl.org/, the Perl Home Page. As I was reviewing source code for modperl_interp.c to determine how the Start, Max, Min interpreter settings are set, I ran across the following line in the source code: /* XXX: if no VirtualHosts w/ PerlOptions +Parent we can skip this */ Since I use Virtual hosts, I updated my httpd.conf config to: PerlOptions +Parent PerlInterpStart 16 PerlInterpMax 128 PerlInterpMinSpare 16 Using PerlOptions +Parent, the PerlTrace i output in error_log confirmed that my desired settings were actually being set correctly now. [pid=2384149, tid=7fb793835900, perl=55b886b7f470] modperl_tipool_init: start=16, max=128, min_spare=16, max_spare=3 So far, no deadlocks. I was able to run httpd server for a week without deadlock. Today I commented out the following line in httpd.conf: #PerlOptions +Parent and re-enabled debug logging and was able to reproduce deadlock within a couple of minutes. Process ID 155691 hung and I collected pstack output as well as Perl trace output from /var/log/httpd/error_log. I extracted only entries that matched process ID 155691 to add to this case. A little background on what I am doing in Perl. From the web page I am calling yearly.json?s=<symbol> script in parallel with 12 different symbols. The mod_perl script yearly.json pulls data out of a Berkley DB database using DBM::Deep CPAN module and returns it in JSON format. By commenting out the #PerlOptions +Parent line, the max number of Perl interpreters is 5 and mod_perl.so is susceptible to deadlocks in this configuration. My workaround for keeping mod_perl from deadlocking is the following Apache httpd.conf configuration: PerlOptions +Parent PerlInterpStart 16 PerlInterpMax 32 PerlInterpMinSpare 16 PerlInterpMaxSpare 16 Created attachment 38428 [details]
pstack output
Created attachment 38429 [details]
Perl trace output
I believe the following line of code in modperl_tipool.c is causing the problem: /* block until an item becomes available */ modperl_tipool_wait(tipool); This is a macro function defined in modperl_tipool.h as: #define modperl_tipool_wait(tipool) \ while (tipool->size == tipool->in_use) { \ MP_TRACE_i(MP_FUNC, \ "waiting for available tipool item in thread 0x%lx", \ MP_TIDF); \ MP_TRACE_i(MP_FUNC, "(%d items in use, %d alive)", \ tipool->in_use, tipool->size); \ COND_WAIT(&tipool->available, &tipool->tiplock); \ } I'm not familiar enough with thread pool management to identify if there is a logic error here. You can find these MP_TRACE_i messages in the attached error_log file. Comment on attachment 38429 [details]
Perl trace output
error_log output
|