Summary:          Reverse Proxy Threads or Children spinning on CPU
Product:          Apache httpd-2
Component:        mod_proxy
Version:          2.2.9
Hardware:         Sun
OS:               Solaris
Status:           RESOLVED FIXED
Severity:         normal
Priority:         P2
Target Milestone: ---
Reporter:         Mina Galić <me>
Assignee:         Apache HTTPD Bugs Mailing List <bugs>
CC:               chris, croller, esuran, marc.stern, perske
Description (Mina Galić, 2008-07-08 02:16:06 UTC):
> Can you please step further through with gdb to see the limits of the
> loop (so we can see where it is really looping)?

I have now managed to capture a number of backtraces (22 so far) of the running processes, or rather LWPs, but I have not managed to get a 'next':

    for lwppid in $( ps -Lef | grep -v grep \
                     | grep $( < /var/opt/bw/apache/proxy/proxy.pid ) \
                     | awk '{print $4}' | xargs ); do
      gdb --batch --eval-command="where" -ex "bt full" \
          --pid=$lwppid /opt/bw/bin/httpd-worker \
          > $lwppid.$(date +"%Y%m%d%H%M%S") 2>/dev/null
    done

Some of the backtraces for the same LWP are identical, which shows that the state does not change for a very long time. I am not quite sure how to proceed next, or whether I should just upload the stack traces.

NEEDINFO.

Some additional info from my side:

    PID   USER PR NI VIRT RES SHR S %CPU %MEM TIME+     PPID  WCHAN COMMAND
    24151 www  25  0 168m 26m 19m R 97.7  1.3 232:08.94 17149 stext httpd-worker

This thread has now been in this very same state for quite a long time (it is currently my main source of backtraces and frustration).

> Can you try to trace the rogue process with ltrace?

Update: after rpluem asked me to ltrace, I tried, but failed to get anything meaningful out of it. ltracing the parent I get output like this:

    11:21:25.046949 SYS_wait4(-1, 0x7fff60c1555c, 3, 0, 0x667168) = 0
    11:21:25.047094 <... apr_proc_wait_all_procs resumed> ) = 70006
    11:21:25.047157 apr_sleep(0xf4240, 0x7fff60c1555c, 70006, -1, 0x667168 <unfinished ...>
    11:21:25.047260 SYS_select(0, 0, 0, 0, 0x7fff60c15570) = 0
    11:21:26.046765 <... apr_sleep resumed> ) = 0
    11:21:26.046874 apr_proc_wait_all_procs(0x7fff60c15690, 0x7fff60c156b8, 0x7fff60c156bc, 1, 0x667168 <unfinished ...>
    11:21:26.046997 SYS_wait4(-1, 0x7fff60c1555c, 3, 0, 0x667168) = 0
    11:21:26.047141 <... apr_proc_wait_all_procs resumed> ) = 70006
    11:21:26.047204 apr_sleep(0xf4240, 0x7fff60c1555c, 70006, -1, 0x667168 <unfinished ...>
    11:21:26.047307 SYS_select(0, 0, 0, 0, 0x7fff60c15570) = 0
    11:21:27.046849 <... apr_sleep resumed> ) = 0
    11:21:27.046950 apr_proc_wait_all_procs(0x7fff60c15690, 0x7fff60c156b8, 0x7fff60c156bc, 1, 0x667168 <unfinished ...>
    11:21:27.047068 SYS_wait4(-1, 0x7fff60c1555c, 3, 0, 0x667168) = 0
    11:21:27.047230 <... apr_proc_wait_all_procs resumed> ) = 70006
    11:21:27.047296 apr_sleep(0xf4240, 0x7fff60c1555c, 70006, -1, 0x667168 <unfinished ...>
    11:21:27.047402 SYS_select(0, 0, 0, 0, 0x7fff60c15570) = 0
    11:21:28.046897 <... apr_sleep resumed> ) = 0
    11:21:28.047003 apr_proc_wait_all_procs(0x7fff60c15690, 0x7fff60c156b8, 0x7fff60c156bc, 1, 0x667168 <unfinished ...>
    11:21:28.047126 SYS_wait4(-1, 0x7fff60c1555c, 3, 0, 0x667168) = 0
    11:21:28.047273 <... apr_proc_wait_all_procs resumed> ) = 70006
    11:21:28.047335 apr_sleep(0xf4240, 0x7fff60c1555c, 70006, -1, 0x667168 <unfinished ...>
    11:21:28.047439 SYS_select(0, 0, 0, 0, 0x7fff60c15570) = 0
    11:21:29.046945 <... apr_sleep resumed> ) = 0
    11:21:29.047046 apr_proc_wait_all_procs(0x7fff60c15690, 0x7fff60c156b8, 0x7fff60c156bc, 1, 0x667168 <unfinished ...>
    11:21:29.047164 SYS_wait4(-1, 0x7fff60c1555c, 3, 0, 0x667168) = 0
    11:21:29.047323 <... apr_proc_wait_all_procs resumed> ) = 70006
    11:21:29.047389 apr_sleep(0xf4240, 0x7fff60c1555c, 70006, -1, 0x667168 <unfinished ...>
    11:21:29.047492 SYS_select(0, 0, 0, 0, 0x7fff60c15570) = 0

which seems natural: this is just the parent's once-per-second maintenance loop (apr_sleep(0xf4240) is a 1,000,000-microsecond sleep between apr_proc_wait_all_procs calls).
Now, if I try to ltrace an LWP, I am hitting a limitation of ltrace itself:
http://lists.alioth.debian.org/pipermail/ltrace-devel/2006-April/000039.html

After about 36 hours of runtime I finally got:

    top - 12:16:45 up 96 days, 20:11, 4 users, load average: 0.00, 0.01, 0.25
    Tasks: 253 total, 2 running, 219 sleeping, 30 stopped, 2 zombie
    Cpu(s): 0.0%us, 0.3%sy, 0.0%ni, 99.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
    Mem: 2097152k total, 1903600k used, 193552k free, 299744k buffers
    Swap: 2097144k total, 18924k used, 2078220k free, 1072724k cached

    PID   USER PR NI VIRT RES SHR  S %CPU %MEM TIME+     PPID  nDRT WCHAN     COMMAND
    24125 www  15  0 168m 26m 19m  T 0.0  1.3  0:00.00   17149 0    finish_st httpd-worker
    24151 www  18  0 168m 26m 19m  T 0.0  1.3  845:53.74 17149 0    finish_st httpd-worker
    5136  www  18  0 345m 11m 2728 T 0.0  0.6  0:00.00   17149 0    finish_st httpd-worker
    5138  www  15  0 345m 11m 2728 T 0.0  0.6  0:00.13   17149 0    finish_st httpd-worker
    5139  www  15  0 345m 11m 2728 T 0.0  0.6  0:00.10   17149 0    finish_st httpd-worker
    5140  www  15  0 345m 11m 2728 T 0.0  0.6  0:00.12   17149 0    finish_st httpd-worker
    5141  www  15  0 345m 11m 2728 T 0.0  0.6  0:00.13   17149 0    finish_st httpd-worker
    5142  www  15  0 345m 11m 2728 T 0.0  0.6  0:00.10   17149 0    finish_st httpd-worker
    5143  www  15  0 345m 11m 2728 T 0.0  0.6  0:00.08   17149 0    finish_st httpd-worker
    5144  www  15  0 345m 11m 2728 T 0.0  0.6  0:00.08   17149 0    finish_st httpd-worker
    5145  www  15  0 345m 11m 2728 T 0.0  0.6  0:00.08   17149 0    finish_st httpd-worker
    5146  www  15  0 345m 11m 2728 T 0.0  0.6  0:00.07   17149 0    finish_st httpd-worker
    5147  www  15  0 345m 11m 2728 T 0.0  0.6  0:00.60   17149 0    finish_st httpd-worker
    5148  www  15  0 345m 11m 2728 T 0.0  0.6  0:00.11   17149 0    finish_st httpd-worker
    5149  www  15  0 345m 11m 2728 T 0.0  0.6  0:00.15   17149 0    finish_st httpd-worker
    5150  www  15  0 345m 11m 2728 T 0.0  0.6  0:00.11   17149 0    finish_st httpd-worker
    5151  www  15  0 345m 11m 2728 T 0.0  0.6  0:00.11   17149 0    finish_st httpd-worker
    5152  www  15  0 345m 11m 2728 T 0.0  0.6  0:00.12   17149 0    finish_st httpd-worker
    5153  www  15  0 345m 11m 2728 T 0.0  0.6  0:00.10   17149 0    finish_st httpd-worker
    5154  www  15  0 345m 11m 2728 T 0.0  0.6  0:00.14   17149 0    finish_st httpd-worker
    5155  www  15  0 345m 11m 2728 T 0.0  0.6  0:00.14   17149 0    finish_st httpd-worker
    5156  www  15  0 345m 11m 2728 T 0.0  0.6  0:00.16   17149 0    finish_st httpd-worker
    5157  www  15  0 345m 11m 2728 T 0.0  0.6  0:00.15   17149 0    finish_st httpd-worker
    5158  www  15  0 345m 11m 2728 T 0.0  0.6  0:00.08   17149 0    finish_st httpd-worker
    5159  www  15  0 345m 11m 2728 T 0.0  0.6  0:00.13   17149 0    finish_st httpd-worker
    5160  www  15  0 345m 11m 2728 T 0.0  0.6  0:00.12   17149 0    finish_st httpd-worker
    5161  www  18  0 345m 11m 2728 T 0.0  0.6  51:08.45  17149 0    finish_st httpd-worker
    5162  www  15  0 345m 11m 2728 T 0.0  0.6  0:00.10   17149 0    finish_st httpd-worker
    5163  www  15  0 345m 11m 2728 T 0.0  0.6  0:00.21   17149 0    finish_st httpd-worker
    5902  www  17  0 0    0   0    Z 0.0  0.0  0:04.58   17149 0    exit      httpd-worker <defunct>
    6251  www  17  0 0    0   0    Z 0.0  0.0  0:00.21   17149 0    exit      httpd-worker <defunct>

The process was no longer LISTENing (according to netstat) and did not react to apachectl stop, or to kill for that matter; I had to kill -9 it.
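Since ltrace cannot follow the LWPs here, a lighter-weight way to keep capturing the same information is to snapshot stacks periodically and compare consecutive snapshots. This is only a sketch: it assumes Solaris pstack (the tool used for the traces later in this report) and reuses the pid-file path from the gdb loop above; adjust both for your system.

    # Snapshot the stacks of the httpd parent and all of its children
    # once per second; pstack prints every LWP's stack in one pass.
    # Consecutive identical snapshots of a busy LWP point directly at
    # the frames it is spinning in.
    pid=$( < /var/opt/bw/apache/proxy/proxy.pid )
    while sleep 1; do
      for p in "$pid" $(pgrep -P "$pid"); do
        pstack "$p" > "pstack.$p.$(date +%Y%m%d%H%M%S)"
      done
    done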
Update: I have managed to reproduce the bug with mpm prefork (on the same system), and, to some extent so far, with mpm worker on OpenSolaris:

    top - 16:32:13 up 102 days, 43 min, 3 users, load average: 1.09, 1.16, 1.10
    Tasks: 234 total, 3 running, 231 sleeping, 0 stopped, 0 zombie
    Cpu(s): 99.7%us, 0.3%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
    Mem: 2097152k total, 1970148k used, 127004k free, 307868k buffers
    Swap: 2097144k total, 18676k used, 2078468k free, 1198244k cached

    PID   USER PR NI VIRT RES  SHR  S %CPU %MEM TIME+    PPID  nDRT WCHAN     COMMAND
    25675 www  25  0 122m 12m  9300 R 97.7 0.6  50:31.07 22818 0    stext     httpd-prefork
    27246 www  15  0 122m 5280 1536 S 0.0  0.3  0:00.03  22818 0    184466121 httpd-prefork
    27251 www  15  0 123m 6216 2356 S 0.0  0.3  0:00.09  22818 0    semtimedo httpd-prefork
    27286 www  15  0 122m 5268 1536 S 0.0  0.3  0:00.02  22818 0    360474278 httpd-prefork
    27287 www  15  0 123m 6424 2356 S 0.0  0.3  0:00.04  22818 0    semtimedo httpd-prefork
    27288 www  15  0 122m 5268 1536 S 0.0  0.3  0:00.00  22818 0    semtimedo httpd-prefork
    27290 www  15  0 123m 5512 1536 S 0.0  0.3  0:00.01  22818 0    semtimedo httpd-prefork
    27318 www  15  0 122m 5248 1536 S 0.0  0.3  0:00.00  22818 0    semtimedo httpd-prefork
    27344 www  15  0 122m 5096 1520 S 0.0  0.2  0:00.02  22818 0    semtimedo httpd-prefork
    27354 www  15  0 122m 5088 1520 S 0.0  0.2  0:00.00  22818 0    semtimedo httpd-prefork
    27356 www  15  0 123m 5392 1536 S 0.0  0.3  0:00.00  22818 0    semtimedo httpd-prefork
    27361 www  15  0 123m 5456 1532 S 0.0  0.3  0:00.00  22818 0    211816788 httpd-prefork

On Solaris I only have a zombified LWP, which will not go away unless the process is killed with -9, but it is not eating 80%+ CPU:

    PID   USERNAME SIZE  RSS   STATE  PRI NICE TIME    CPU  PROCESS/LWPID
    16635 webservd 9488K 3312K sleep  59  0    0:00:00 0.0% httpd-worker/23
    [/snip]
    16635 webservd 9488K 3312K sleep  59  0    0:00:00 0.0% httpd-worker/3
    16635 webservd 9488K 3312K zombie 0   -    0:00:00 0.0% httpd-worker/2
    16635 webservd 9488K 3312K sleep  59  0    0:00:00 0.0% httpd-worker/1
    [/snip]
    Total: 3 processes, 84 lwps, load averages: 0.02, 0.01, 0.01

I am experiencing the same symptoms as Igor (a single httpd sucking up 90% of CPU time) using the 2.2.10 test release on openSUSE 10.3 x86_64, and have a couple of observations to add. I run worker in a pure reverse-proxy scenario (built with the included APR); the one major difference from Igor is that I do not use mod_deflate at all. First observation: the out-of-control process had three sockets open, and they were all in CLOSE_WAIT, waiting to eventually be released. Second observation: after setting "disablereuse on" for every ProxyPass, the problem has not recurred. In addition, errors of the type "(70014)End of file found: proxy: error reading status line from remote server" have disappeared. Happy to provide any information I can to help; just tell me what is needed.

Has anyone found a fix for this? We have the exact same problem with 2.2.9 and also with 2.2.10. I tried the disablereuse workaround, but mod_proxy_balancer does not support this in the ProxyPass directive.

I am experiencing a similar problem on Win2K3 with Apache 2.2.8 and 2.2.11, using mod_proxy and mod_rewrite, where mod_rewrite enables proxying via its [P] flag.

We ran into this bug with Apache 2.2.9 on Solaris 10. Here are the stack traces:
# pstack 19235
19235: /opt/coolstack/apache2/bin/httpd -k start
ff236b1c ???????? (dafa0, 87c88, 87c88, 455518, 1b7788, 87d50)
fe2ec680 connection_cleanup (dafe0, dafe0, 7, 96bd0, 8fc20, 974c8) + 148
fe2ed0a8 ap_proxy_release_connection (fe2c5fb8, dafe0, 97440, 10400, fe300a0c, 96bd0) + 4c
fe2c5064 proxy_http_handler (fffffffe, dafe0, ec318, ffbf9694, fe2c5fb8, 1ac660) + 3e8
fe2e508c proxy_handler (fe2c4c7c, fe3018b8, 1f78c8, fe2eec9c, ec3e0, ec318) + 728
0003fe60 ap_run_handler (1f78c8, 4, 826ac, 826c8, 9f330, 0) + 3c
00040478 ap_invoke_handler (1f78c8, 9f450, 0, 0, 0, fe2ee1c8) + e8
0004fcbc ap_process_request (0, 80000, 1f78c8, 0, 0, 1f78c8) + 54
0004ca68 ap_process_http_connection (1999e8, 1f78c8, 8fc20, 80000, 801b8, 80000000) + 6c
000482f8 ap_run_process_connection (1999e8, 199750, 9ef88, 82ee4, 9f760, 0) + 3c
0005505c child_main (22, 1999e8, 6b75c, 82f30, 803bc, 1) + 64c
000553b0 make_child (54400, 33, 0, 0, 34, 80000) + 13c
00055654 perform_idle_server_maintenance (80000, 57, 19, 803ac, 8337c, 4) + 1d0
00055d54 ap_mpm_run (80000, 0, ffffffff, 27, 803c4, 82f40) + 678
0002928c main (8bdb0, ffbffd08, 8023c, 7f090, 64650, 7f080) + 9f4
00028378 _start (0, 0, 0, 0, 0, 0) + 108
# mdb -p 19235
Loading modules: [ ld.so.1 libc.so.1 libuutil.so.1 ]
> $C
ffbf9498 libapr-1.so.0.3.0`apr_thread_mutex_unlock+0xc(dafa0, 87c88, 87c88, 455518, 1b7788, 87d50)
ffbf94f8 mod_proxy.so`connection_cleanup+0x148(dafe0, dafe0, 7, 96bd0, 8fc20, 974c8)
ffbf9560 mod_proxy.so`ap_proxy_release_connection+0x4c(fe2c5fb8, dafe0, 97440, 10400, fe300a0c, 96bd0)
ffbf95c8 mod_proxy_http.so`proxy_http_handler+0x3e8(fffffffe, dafe0, ec318, ffbf9694, fe2c5fb8, 1ac660)
ffbf96b8 mod_proxy.so`proxy_handler+0x728(fe2c4c7c, fe3018b8, 1f78c8, fe2eec9c, ec3e0, ec318)
ffbf9790 httpd`ap_run_handler+0x3c(1f78c8, 4, 826ac, 826c8, 9f330, 0)
ffbf97f0 httpd`ap_invoke_handler+0xe8(1f78c8, 9f450, 0, 0, 0, fe2ee1c8)
ffbf9858 httpd`ap_process_request+0x54(0, 80000, 1f78c8, 0, 0, 1f78c8)
ffbf98b8 httpd`ap_process_http_connection+0x6c(1999e8, 1f78c8, 8fc20, 80000, 801b8, 80000000)
ffbf9920 httpd`ap_run_process_connection+0x3c(1999e8, 199750, 9ef88, 82ee4, 9f760, 0)
ffbf9980 httpd`child_main+0x64c(22, 1999e8, 6b75c, 82f30, 803bc, 1)
ffbf9a18 httpd`make_child+0x13c(54400, 33, 0, 0, 34, 80000)
ffbf9a78 httpd`perform_idle_server_maintenance+0x1d0(80000, 57, 19, 803ac, 8337c, 4)
ffbf9b78 httpd`ap_mpm_run+0x678(80000, 0, ffffffff, 27, 803c4, 82f40)
ffbf9c20 main+0x9f4(8bdb0, ffbffd08, 8023c, 7f090, 64650, 7f080)
ffbffd20 _start+0x108(0, 0, 0, 0, 0, 0)
We use Apache in prefork mode. The Apache process fully utilizes one CPU when it is stuck. If we disable connection reuse with disablereuse=On, there are no stuck Apache processes.
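For reference, a minimal sketch of the workaround the commenters above describe. The mount point and backend URL are placeholders, not taken from this report; disablereuse=On tells mod_proxy to open a fresh backend connection for every request instead of recycling pooled ones, trading some connection-setup overhead for avoiding the reuse code path:

    # Placeholder backend; the disablereuse=On parameter is the point here.
    ProxyPass        /app http://backend.example.com:8080/app disablereuse=On
    ProxyPassReverse /app http://backend.example.com:8080/app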
I think some of the problem symptoms described here were fixed in 2.2.11 for PR 45792:
http://svn.apache.org/viewvc?view=revision&revision=713146

Note that one of the folks chiming in on this thread already had 2.2.11 at the time. (Apologies for claiming 2.2.14 earlier today in one of the httpd sessions at ApacheCon; some folks I support picked up the pool corruption/cleanup loop fix when *they* upgraded to 2.2.14, and I remembered incorrectly that 2.2.14 was the first release with the fix.)

After a month of runtime without running into this issue, I am closing this ticket.

*** Bug 45483 has been marked as a duplicate of this bug. ***
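For background on the class of fix referenced above: connection_cleanup, the mod_proxy frame seen spinning in the stack traces, runs as an APR pool cleanup. The sketch below only demonstrates the underlying APR cleanup API (register a cleanup, kill it before the resource is handed back for reuse); the fake_conn_t type and conn_cleanup function are hypothetical stand-ins, not mod_proxy's actual code or the actual fix.

    /* Minimal, self-contained APR cleanup example; build with -lapr-1. */
    #include <stdio.h>
    #include <apr_general.h>
    #include <apr_pools.h>

    typedef struct { int id; } fake_conn_t;   /* hypothetical stand-in */

    static apr_status_t conn_cleanup(void *data)
    {
        fake_conn_t *conn = data;
        printf("cleanup fired for connection %d\n", conn->id);
        return APR_SUCCESS;
    }

    int main(void)
    {
        apr_pool_t *pool;
        fake_conn_t conn = { 42 };

        apr_initialize();
        apr_pool_create(&pool, NULL);

        /* Tie the connection's teardown to the pool's lifetime. */
        apr_pool_cleanup_register(pool, &conn, conn_cleanup,
                                  apr_pool_cleanup_null);

        /* Before the connection is handed back for reuse, the cleanup
         * must be removed; leaving it registered lets a stale cleanup
         * run later against a recycled connection. */
        apr_pool_cleanup_kill(pool, &conn, conn_cleanup);

        apr_pool_destroy(pool);   /* nothing fires: the cleanup was killed */
        apr_terminate();
        return 0;
    }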