Hello,

We are facing a problem where "apachectl -k graceful" causes segmentation faults while requests are being processed, and these requests sometimes fail. Could you give us any idea how to solve (or avoid) this problem? (This problem is similar to Bug#: 39834.)

We are using mod_jk 1.2.6 (we tried 1.2.18, but it has the same problem) with Apache 2.0.58 on RHELv3. JkLogLevel is set to "error" (debug, info, and warn also reproduce it, more frequently). It may occur only when Apache is writing to mod_jk.log, because it doesn't occur when the "JkLogFile" directive is commented out or when no requests are being processed.

- Reproduction steps
1. Generate a burst of load with 10 concurrent users
2. # apachectl -k graceful

- Configuration of jk in httpd.conf
LoadModule jk_module modules/mod_jk.so
<IfModule mod_jk.c>
JkWorkersFile /opt/****/apache2/conf/workers.properties
JkLogFile /opt/****/apache2/logs/mod_jk.log
JkLogLevel error
</IfModule>

- error_log
[Fri Aug 25 14:47:16 2006] [notice] Graceful restart requested, doing restart
httpd: Could not determine the server's fully qualified domain name, using 127.0.0.1 for ServerName
[Fri Aug 25 14:47:16 2006] [notice] Apache configured -- resuming normal operations
[Fri Aug 25 14:47:16 2006] [error] (43)Identifier removed: apr_global_mutex_lock (jk_log_lock) failed
[Fri Aug 25 14:47:17 2006] [notice] child pid 1867 exit signal Segmentation fault (11)
[Fri Aug 25 14:47:17 2006] [error] (43)Identifier removed: apr_global_mutex_lock (jk_log_lock) failed
[Fri Aug 25 14:47:18 2006] [error] (43)Identifier removed: apr_global_mutex_lock (jk_log_lock) failed
[Fri Aug 25 14:47:18 2006] [error] (43)Identifier removed: apr_global_mutex_lock (jk_log_lock) failed
[Fri Aug 25 14:47:18 2006] [notice] child pid 1865 exit signal Segmentation fault (11)
[Fri Aug 25 14:47:18 2006] [notice] child pid 1866 exit signal Segmentation fault (11)
[Fri Aug 25 14:47:18 2006] [notice] child pid 1868 exit signal Segmentation fault (11)

- mod_jk.log
[Fri Aug 25 14:47:16 2006] [jk_ajp_common.c (1462)]: ERROR: Client connection aborted or network problems
[Fri Aug 25 14:47:17 2006] [jk_ajp_common.c (1462)]: ERROR: Client connection aborted or network problems
[Fri Aug 25 14:47:18 2006] [jk_ajp_common.c (1462)]: ERROR: Client connection aborted or network problems
[Fri Aug 25 14:47:18 2006] [jk_ajp_common.c (1462)]: ERROR: Client connection aborted or network problems
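For anyone comparing runs across mod_jk versions, a small script can tally how many restarts, lock failures, and crashed children an error_log contains. This is only a rough sketch: the patterns are copied from the excerpt above, the function name summarize is made up for illustration, and the whitespace before "(jk_log_lock)" is matched loosely because the pasted log was wrapped at 80 columns.

```python
import re
from collections import Counter

# Patterns copied from the error_log excerpt in this report.
# \s* tolerates the uncertain whitespace introduced by the 80-column wrap.
RESTART = re.compile(r"Graceful restart requested")
LOCK = re.compile(r"apr_global_mutex_lock\s*\(jk_log_lock\) failed")
SEGV = re.compile(r"child pid (\d+) exit signal Segmentation fault \(11\)")


def summarize(lines):
    """Count graceful restarts, jk_log_lock failures, and segfaulted children.

    Returns a (Counter, list_of_pids) pair. `summarize` is a hypothetical
    helper for this thread, not part of Apache or mod_jk.
    """
    counts = Counter()
    pids = []
    for line in lines:
        if RESTART.search(line):
            counts["graceful_restarts"] += 1
        if LOCK.search(line):
            counts["lock_failures"] += 1
        match = SEGV.search(line)
        if match:
            counts["segfaults"] += 1
            pids.append(int(match.group(1)))
    return counts, pids
```

Calling summarize(open(path)) on the error_log before and after an upgrade gives a quick way to see whether only the segfaults disappeared or the lock failures as well.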
I have reproduced this problem with 1.2.15 as follows.

1) Set reply_timeout to, for example, 30 (seconds)
2) From a browser, post a request for which Tomcat sleeps longer than reply_timeout
3) Restart Apache with the graceful option while Tomcat is sleeping

I read apache-2.0/mod_jk.c. If graceful is invoked while mod_jk is processing a request, apr_global_mutex_unlock will fail. Then ap_log_rerror is called, but that call contains a typo, so a segmentation fault occurs around the DSO. This typo was fixed in 1.2.16.
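The three steps above could be scripted roughly as below, which may help when retesting newer versions. Everything here is an assumption for illustration: the function names, the delay, and the URL of a page that sleeps in Tomcat are hypothetical, and the restart command needs enough privileges to signal httpd.

```python
import subprocess
import threading
import time
import urllib.request


def slow_request(url, timeout=120):
    """Issue one request; return the HTTP status, or the exception text on failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except Exception as exc:  # aborted connections and HTTP errors end up here
        return repr(exc)


def graceful_restart():
    """Run the restart command from this report and return its exit code."""
    return subprocess.run(["apachectl", "-k", "graceful"], check=False).returncode


def reproduce(url, delay=5.0, request=slow_request, restart=graceful_restart):
    """Start a request, restart Apache while it is in flight, return both outcomes."""
    outcome = {}
    worker = threading.Thread(target=lambda: outcome.update(request=request(url)))
    worker.start()
    time.sleep(delay)  # give the request time to reach Tomcat before restarting
    outcome["restart"] = restart()
    worker.join()
    return outcome
```

The request and restart callables are parameters so the timing logic can be tried without a live server; against a real setup, url would point at a page mapped to mod_jk that sleeps longer than reply_timeout.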
Takayuki,

Thank you for your reply. I have also confirmed that the segmentation fault is fixed in 1.2.18 (1.2.16 was not released).

> [Fri Aug 25 14:47:17 2006] [error] (43)Identifier removed: apr_global_mutex_lock
> (jk_log_lock) failed

I am not sure whether this is a bug or expected behavior. Please tell me whether this error causes a problem for the application. If it does (if requests fail), let me know how to avoid or solve it.

Thank you
(In reply to comment #2)
> I have also confirmed Segmentation Fault had been fixed with 1.2.18(1.2.16 is
> not released).

Your first post says it still occurred with 1.2.18. Which is correct? Is it reproducible with 1.2.18?
I'm sorry for confusing you.

The "(jk_log_lock) failed" error in error_log still occurs with 1.2.18.
The "Segmentation Fault" error in mod_jk.log is solved with 1.2.18.
Both errors occur with 1.2.6.
Could any of the commenters please tell us whether either of the problems still exists with 1.2.19 or the upcoming version 1.2.20? Otherwise we will close this case.
Hi everyone,

I'm experiencing this issue on Solaris 10 with mod_jk 1.2.23 in Apache 2.0.53 with different log levels. Unfortunately I only have dbx available, whose output sometimes puzzles me a bit, but it looks like jk_log() itself triggers a second call of jk_log().

t@1 (l@1) signal SEGV (no mapping at the fault address) in (unknown) at 0x820fe6c
0x0820fe6c: outl (%dx)
Current function is jk_log
  581   l->log(l, level, buf);
(dbx) where
current thread: t@1
  [1] 0x820fe6c(0x0, 0x0, 0x0, 0x0), at 0x820fe6c
=>[2] jk_log(l = 0x820fe58, file = 0x1 "<bad address 0x1>", line = 134497984, funcname = 0x80466f8 "\xbe^F", level = 1633643370, fmt = 0x62 "<bad address 0x62>", ...), line 581 in "jk_util.c"
  [3] jk_log(l = 0x820fe58, file = 0x81032e4 "mod_jk.c", line = 452, funcname = 0x81032ac "ws_write", level = 1, fmt = 0x81032ed "written %d out of %d", ...), line 581 in "jk_util.c"
  [4] ws_write(s = 0x8047900, b = 0x824598b, l = 1726U), line 452 in "mod_jk.c"
  [5] ajp_process_callback(msg = 0x8224df4, pmsg = 0x8224e0c, ae = 0x8224dc0, r = 0x8047900, l = 0x8170a10), line 1447 in "jk_ajp_common.c"
  [6] ajp_get_reply(e = 0x8226de8, s = 0x8047900, l = 0x8170a10, p = 0x8224dc0, op = 0x80467f0), line 1647 in "jk_ajp_common.c"
  [7] ajp_service(e = 0x8226de8, s = 0x8047900, l = 0x8170a10, is_error = 0x804688c), line 1828 in "jk_ajp_common.c"
  [8] service(e = 0x8217d3c, s = 0x8047900, l = 0x8170a10, is_error = 0x80479ec), line 1007 in "jk_lb_worker.c"
  [9] jk_handler(r = 0x82379a8), line 2174 in "mod_jk.c"
  [10] ap_run_handler(r = 0x82379a8), line 152 in "config.c"
  [11] ap_invoke_handler(r = 0x82379a8), line 364 in "config.c"
  [12] ap_process_request(r = 0x82379a8), line 249 in "http_request.c"
  [13] ap_process_http_connection(c = 0x8231a68), line 253 in "http_core.c"
  [14] ap_run_process_connection(c = 0x8231a68), line 43 in "connection.c"
  [15] ap_process_connection(c = 0x8231a68, csd = 0x8231990), line 176 in "connection.c"
  [16] child_main(child_num_arg = 36), line 610 in "prefork.c"
  [17] make_child(s = 0x814b890, slot = 36), line 704 in "prefork.c"
  [18] perform_idle_server_maintenance(p = 0x8146770), line 839 in "prefork.c"
  [19] ap_mpm_run(_pconf = 0x8146770, plog = 0x817c848, s = 0x814b890), line 1040 in "prefork.c"
  [20] main(argc = 4, argv = 0x8047d88), line 623 in "main.c"
(dbx) dump
f = (nil)
used = 0
args = (nil)
buf = ""
rc = 24
l = 0x820fe58
usable_size = 8191
file = 0x1 "<bad address 0x1>"
fmt = 0x62 "<bad address 0x62>"
funcname = 0x80466f8 "\xbe^F"
level = 1633643370
line = 134497984
(dbx) up
Current function is jk_log
  581   l->log(l, level, buf);
(dbx) print buf
buf = "[Wed Dec 12 13:04:44 2007] [15364:0000] [debug] ws_write::mod_jk.c (452): written 1726 out of 1726"

SunOS 5.10 Generic_118855-36 i86pc i386 i86pc
Apache/2.0.59 (Unix) mod_ssl/2.0.59 OpenSSL/0.9.8e mod_jk/1.2.23

Cheers,
Daniel
Can you easily reproduce it? If so, it would be very helpful if you could compile and use the sources available at http://people.apache.org/~rjung/mod_jk-dev/

Those are from a 1.2.26 development snapshot.
(In reply to comment #7) No, unfortunately I haven't yet found out how to reproduce it. After reading the bug reports I was quite sure that it was caused by apachectl graceful, as the problem only showed up when logadm had run. Executing logadm or manually issuing apachectl graceful and/or truncating the log files doesn't trigger the error though. This may even be a totally different issue to what the reporter described. I installed the dev snapshot and hope to see the error come up again soon.
(In reply to comment #8)
> I installed the dev snapshot and hope to see the error come up again soon.

The error has not occurred again since. I tried to make sure that no changes were introduced in the compile process, so I guess the problem was already fixed by another code change.
Closing now, since at the moment we have no more observations of this for recent versions.