Bug 66316

Summary: httpd hung at __lll_lock_wait / pthread_mutex_lock
Product: Apache httpd-2 Reporter: Seann Herdejurgen <seann>
Component: mod_http2Assignee: Apache HTTPD Bugs Mailing List <bugs>
Status: NEW ---    
Severity: major CC: deuba, seann
Priority: P2    
Version: 2.4.53   
Target Milestone: ---   
Hardware: PC   
OS: Linux   
Attachments: pstack output
pmap output
lsof output
pstack output
Perl trace output

Description Seann Herdejurgen 2022-10-17 11:54:38 UTC
Created attachment 38416 [details]
pstack output

Unable to select 2.4.54 version in bug report.

> rpm -qi httpd
Name        : httpd
Version     : 2.4.54
Release     : 1.fc35
Architecture: x86_64
Install Date: Wed 13 Jul 2022 09:34:02 AM MDT
Group       : Unspecified
Size        : 4933267
License     : ASL 2.0
Signature   : RSA/SHA256, Mon 27 Jun 2022 04:47:44 AM MDT, Key ID db4639719867c58f
Source RPM  : httpd-2.4.54-1.fc35.src.rpm
Build Date  : Mon 27 Jun 2022 04:15:49 AM MDT
Build Host  : buildvm-x86-30.iad2.fedoraproject.org
Packager    : Fedora Project
Vendor      : Fedora Project
URL         : https://httpd.apache.org/
Bug URL     : https://bugz.fedoraproject.org/httpd
Summary     : Apache HTTP Server
Description :
The Apache HTTP Server is a powerful, efficient, and extensible
web server.

I'm attempting to enable mod_perl under Apache and httpd server hangs when I attempt 12 simultaneous connections to the same .json (Perl) script to query data using DBM::Deep under PerlResponseHandler ModPerl::Registry. The same script runs fine as an ordinary CGI script. I ran httpd server in debug mode (httpd -X), reproduced the problem and collected pstack output. Five threads appeared to hang with the same stack trace. Here is the stack trace:

Thread 68 (Thread 0x7f884a7c4640 (LWP 1887571) "httpd"):
#0  0x00007f887e69c580 in __lll_lock_wait () from /lib64/libc.so.6
#1  0x00007f887e6a2c42 in pthread_mutex_lock@@GLIBC_2.2.5 () from /lib64/libc.so.6
#2  0x00007f887dba656d in open_output.lto_priv () from /etc/httpd/modules/mod_http2.so
#3  0x00007f887dba6afa in h2_filter_secondary_output.lto_priv () from /etc/httpd/modules/mod_http2.so
#4  0x00007f887e3800ca in cache_save_store () from /etc/httpd/modules/mod_cache.so
#5  0x00007f887e381bcc in cache_save_filter () from /etc/httpd/modules/mod_cache.so
#6  0x00007f887e295689 in deflate_out_filter () from /etc/httpd/modules/mod_deflate.so
#7  0x00007f887e2526de in filter_harness () from /etc/httpd/modules/mod_filter.so
#8  0x00007f887e390441 in compress_filter () from /etc/httpd/modules/mod_brotli.so
#9  0x00007f887e2526de in filter_harness () from /etc/httpd/modules/mod_filter.so
#10 0x000055e3318b1ed4 in end_output_stream.lto_priv ()
#11 0x000055e3318d9a86 in ap_process_async_request ()
#12 0x000055e3318d9e73 in ap_process_request ()
#13 0x00007f887dba7a02 in h2_task_process_conn () from /etc/httpd/modules/mod_http2.so
#14 0x000055e3318a5dd8 in ap_run_process_connection ()
#15 0x00007f887dbaa8b7 in slot_run () from /etc/httpd/modules/mod_http2.so
#16 0x00007f887e69f822 in start_thread () from /lib64/libc.so.6
#17 0x00007f887e63f450 in clone3 () from /lib64/libc.so.6

Present in the stack trace are mod_http2.so, mod_cache.so, mod_filter.so and mod_brotli.so. I'm unsure why mod_perl.so is missing from stack trace as that is the change I am introducing into the environment.

I also collected pmap and lsof output that I will add below. I'm wondering if there might be a memory leak because pmap output indicated process grew to 5719964K.
Comment 1 Seann Herdejurgen 2022-10-17 11:55:40 UTC
Created attachment 38417 [details]
pmap output
Comment 2 Seann Herdejurgen 2022-10-17 11:56:10 UTC
Created attachment 38418 [details]
lsof output
Comment 3 Stefan Eissing 2022-10-17 13:00:46 UTC
What I can see in the pstack output is many threads like this:

Thread 73 (Thread 0x7f8842ffd640 (LWP 1887577) "httpd"):
#0  0x00007f887e69c3ba in __futex_abstimed_wait_common () from /lib64/libc.so.6
#1  0x00007f887e69ebc0 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libc.so.6
#2  0x00007f887dbd6b73 in modperl_tipool_pop () from /etc/httpd/modules/mod_perl.so
#3  0x00007f887dbd6c8b in modperl_interp_get () from /etc/httpd/modules/mod_perl.so
#4  0x00007f887dbd6fce in modperl_interp_select () from /etc/httpd/modules/mod_perl.so
#5  0x00007f887dbde587 in modperl_config_req_cleanup () from /etc/httpd/modules/mod_perl.so
#6  0x00007f887e99bcbe in apr_pool_destroy () from /lib64/libapr-1.so.0
#7  0x00007f887e99bc9d in apr_pool_destroy () from /lib64/libapr-1.so.0
#8  0x000055e3318b876e in ap_read_request ()
#9  0x000055e3318da24b in ap_process_http_connection ()
#10 0x000055e3318a5dd8 in ap_run_process_connection ()
#11 0x00007f887e10d9a5 in process_socket () from /etc/httpd/modules/mod_mpm_event.so
#12 0x00007f887e10e43a in worker_thread () from /etc/httpd/modules/mod_mpm_event.so
#13 0x00007f887e69f822 in start_thread () from /lib64/libc.so.6
#14 0x00007f887e63f450 in clone3 () from /lib64/libc.so.6

I am not familiar with that mod_perl is doing here. But it seems to block when a HTTP/2 request/response has been handled and the allocated resources a freed.

This free of a HTTP/2 stream (request/response) is done under lock of the "main" connection. And that prevents other requests on the same connection to progress. That is the lock waits you see in output filters.

What mod_perl is waiting for here, I do not know.
Comment 4 Seann Herdejurgen 2022-10-19 17:09:36 UTC
I have a production environment and a QA environment. In production, DBM data is stored in a tmpfs file system instead of hark disk. With a tmpfs file system, I am unable to reproduce deadlock using mod_perl's PerlResponseHandler ModPerl::Registry. I have been running ModPerl::Registry using DBM data on tmpfs file system in production for two days without a deadlock.

If I implement storing DBM data in tmpfs file system in QA I do not see deadlock. If I revert to hard disk then I can reproduce deadlock within a couple of minutes.

This implies to me that whatever Perl is waiting for in the deadlock situation is related to the speed of the storage subsystem.
Comment 5 Seann Herdejurgen 2022-10-22 18:22:28 UTC
I located the source code for the modperl_tipool_pop() method. If the XXX: code below is executed, the method returns before releasing the modperl_tipool_lock(tipool) lock.

modperl_list_t *modperl_tipool_pop(modperl_tipool_t *tipool)
{
    modperl_list_t *head;

    modperl_tipool_lock(tipool);

    if (tipool->size == tipool->in_use) {
        if (tipool->size < tipool->cfg->max) {
            MP_TRACE_i(MP_FUNC,
                       "no idle items, size %d < %d max\n",
                       tipool->size, tipool->cfg->max);
            if (tipool->func->tipool_rgrow) {
                void * item =
                    (*tipool->func->tipool_rgrow)(tipool, tipool->data);

                modperl_tipool_add(tipool, item);
            }
        }
        /* block until an item becomes available */
        modperl_tipool_wait(tipool);
    }

    head = tipool->idle;

    tipool->idle = modperl_list_remove(tipool->idle, head);
    tipool->busy = modperl_list_append(tipool->busy, head);

    tipool->in_use++;

    /* XXX: this should never happen */
    if (!head) {
        MP_TRACE_i(MP_FUNC, "PANIC: no items available, %d of %d in use\n", 
                   tipool->in_use, tipool->size);
        abort();
    }

    modperl_tipool_unlock(tipool);

    return head;
}
Comment 6 Seann Herdejurgen 2022-10-22 18:45:48 UTC
I am still able to cause deadlock now even using RAM disk.

I wrote the following script to analyze running httpd processes when I encounter deadlock.

#!/bin/sh

for pid in $(pgrep -u apache httpd); do
   pstack $pid | egrep 'lock|modperl_tipool'
done

Here is sample output when a deadlock occurs:

#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#0  0x00007f1bf0c9c580 in __lll_lock_wait () from target:/lib64/libc.so.6
#1  0x00007f1bf0ca2c42 in pthread_mutex_lock@@GLIBC_2.2.5 () from target:/lib64/libc.so.6
#0  0x00007f1bf0c9c580 in __lll_lock_wait () from target:/lib64/libc.so.6
#1  0x00007f1bf0ca2c42 in pthread_mutex_lock@@GLIBC_2.2.5 () from target:/lib64/libc.so.6
#0  0x00007f1bf0c9c580 in __lll_lock_wait () from target:/lib64/libc.so.6
#1  0x00007f1bf0ca2c42 in pthread_mutex_lock@@GLIBC_2.2.5 () from target:/lib64/libc.so.6
#0  0x00007f1bf0c9c580 in __lll_lock_wait () from target:/lib64/libc.so.6
#1  0x00007f1bf0ca2c42 in pthread_mutex_lock@@GLIBC_2.2.5 () from target:/lib64/libc.so.6
#0  0x00007f1bf0c9c580 in __lll_lock_wait () from target:/lib64/libc.so.6
#1  0x00007f1bf0ca2c42 in pthread_mutex_lock@@GLIBC_2.2.5 () from target:/lib64/libc.so.6
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so
#2  0x00007f1bf03dcb73 in modperl_tipool_pop () from target:/etc/httpd/modules/mod_perl.so

When httpd server is not deadlocked, this script produces no output.
Comment 7 Seann Herdejurgen 2022-10-23 02:08:14 UTC
I updated the modperl_tipool_pop() method in modperl_tipool.c and moved the following line before the XXX: code section in the same file:

   modperl_tipool_unlock(tipool);

After installing the newly compiled module and testing in FT, QA and production environments, I was unable to create a deadlock.
Comment 8 Seann Herdejurgen 2022-10-23 11:51:51 UTC
In additional testing this morning, I was able to reproduce the deadlock on a slower system. I recompiled mod_perl.so with MP_DEBUG=1 and enabled PerlTrace i. For some reason I am only seeing three Perl interpreters starting and a max of five:

[pid=2337140, tid=7fdb979e5900, perl=5594526d6150] modperl_tipool_init: start=3, max=5, min_spare=3, max_spare=3

I have the following lines in httpd.conf:

    PerlInterpStart 16
    PerlInterpMax 128
    PerlInterpMinSpare 16

I thought these lines would increase the max number of Perl interpeters. In looking at source, USE_ITHREADS needs to be #defined. I'm not sure how to confirm that it is being defined. I'm guessing that it is because apachectl configtest passes and httpd processes start.

proxy::</home/seann/rpmbuild/SOURCES/mod_perl-2.0.12> perl -V:useithreads
useithreads='define';

proxy::</home/seann/rpmbuild/SOURCES/mod_perl-2.0.12> perl -v

This is perl 5, version 34, subversion 1 (v5.34.1) built for x86_64-linux-thread-multi

Copyright 1987-2022, Larry Wall

Perl may be copied only under the terms of either the Artistic License or the
GNU General Public License, which may be found in the Perl 5 source kit.

Complete documentation for Perl, including FAQ lists, should be found on
this system using "man perl" or "perldoc perl".  If you have access to the
Internet, point your browser at http://www.perl.org/, the Perl Home Page.
Comment 9 Seann Herdejurgen 2022-10-24 19:59:48 UTC
As I was reviewing source code for modperl_interp.c to determine how the Start, Max, Min interpreter settings are set, I ran across the following line in the source code:

        /* XXX: if no VirtualHosts w/ PerlOptions +Parent we can skip this */

Since I use Virtual hosts, I updated my httpd.conf config to:

    PerlOptions +Parent
    PerlInterpStart 16
    PerlInterpMax 128
    PerlInterpMinSpare 16

Using PerlOptions +Parent, the PerlTrace i output in error_log confirmed that my desired settings were actually being set correctly now.

[pid=2384149, tid=7fb793835900, perl=55b886b7f470] modperl_tipool_init: start=16, max=128, min_spare=16, max_spare=3

So far, no deadlocks.
Comment 10 Seann Herdejurgen 2022-10-31 21:05:52 UTC
I was able to run httpd server for a week without deadlock.

Today I commented out the following line in httpd.conf:

   #PerlOptions +Parent

and re-enabled debug logging and was able to reproduce deadlock within a couple of minutes. Process ID 155691 hung and I collected pstack output as well as Perl trace output from /var/log/httpd/error_log. I extracted only entries that matched process ID 155691 to add to this case.

A little background on what I am doing in Perl. From the web page I am calling yearly.json?s=<symbol> script in parallel with 12 different symbols. The mod_perl script yearly.json pulls data out of a Berkley DB database using DBM::Deep CPAN module and returns it in JSON format. By commenting out the #PerlOptions +Parent line, the max number of Perl interpreters is 5 and mod_perl.so is susceptible to deadlocks in this configuration.

My workaround for keeping mod_perl from deadlocking is the following Apache httpd.conf configuration:

    PerlOptions +Parent
    PerlInterpStart 16
    PerlInterpMax 32
    PerlInterpMinSpare 16
    PerlInterpMaxSpare 16
Comment 11 Seann Herdejurgen 2022-10-31 21:07:21 UTC
Created attachment 38428 [details]
pstack output
Comment 12 Seann Herdejurgen 2022-10-31 21:08:02 UTC
Created attachment 38429 [details]
Perl trace output
Comment 13 Seann Herdejurgen 2022-10-31 21:27:22 UTC
I believe the following line of code in modperl_tipool.c is causing the problem:

        /* block until an item becomes available */
        modperl_tipool_wait(tipool);

This is a macro function defined in modperl_tipool.h as:

#define modperl_tipool_wait(tipool) \
    while (tipool->size == tipool->in_use) { \
        MP_TRACE_i(MP_FUNC, \
                   "waiting for available tipool item in thread 0x%lx", \
                   MP_TIDF); \
        MP_TRACE_i(MP_FUNC, "(%d items in use, %d alive)", \
                   tipool->in_use, tipool->size); \
        COND_WAIT(&tipool->available, &tipool->tiplock); \
    }

I'm not familiar enough with thread pool management to identify if there is a logic error here. You can find these MP_TRACE_i messages in the attached error_log file.
Comment 14 Seann Herdejurgen 2022-10-31 21:28:21 UTC
Comment on attachment 38429 [details]
Perl trace output

error_log output