40316 – apachectl -k graceful causes segmentation fault

Summary: apachectl -k graceful causes segmentation fault

Status:	RESOLVED FIXED

Alias:	None

Product:	Tomcat Connectors
Classification:	Unclassified
Component:	Common
Version:	unspecified
Hardware:	All Linux

Importance:	P2 normal
Target Milestone:	---
Assignee:	Tomcat Developers Mailing List

URL:
Keywords:

Depends on:	39834
Blocks:

Reported:	2006-08-25 07:30 UTC by Yutaka Tanaka
Modified:	2008-10-05 03:09 UTC
CC List:	0 users

Description Yutaka Tanaka 2006-08-25 07:30:42 UTC

Hello,

We are facing a problem where "apachectl -k graceful" causes segmentation faults
while requests are being processed, and some of those requests fail.
Could you give us any idea how to solve or avoid this problem?
(This problem is similar to Bug 39834.)

We are using mod_jk 1.2.6 (we also tried 1.2.18, which has the same problem) with
Apache 2.0.58 on RHEL 3.
JkLogLevel is set to "error" (with debug, info, or warn the problem reproduces
more frequently).
It may occur only while Apache is writing to mod_jk.log, because it does not
occur when the "JkLogFile" directive is commented out or when no requests are
being processed.


- Reproduction steps
 1. Generate a burst of requests from 10 concurrent users
 2. # apachectl -k graceful
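
The two steps above can be sketched as a small script. This is only an illustration under stated assumptions: the report does not say which URL was requested or which load tool was used, so TARGET_URL, REQUESTS_PER_USER, and the helper names are hypothetical. Only the user count (10) and the apachectl command come from the report.

```python
import subprocess
import threading
import urllib.request

# Hypothetical values: the report does not give the URL or request count.
TARGET_URL = "http://localhost/app/"
USERS = 10                 # "10 concurrent users" from the report
REQUESTS_PER_USER = 50


def fetch(url):
    """Issue one GET; return the HTTP status, or the exception on failure."""
    try:
        with urllib.request.urlopen(url, timeout=30) as resp:
            return resp.status
    except Exception as exc:  # failed requests are what we want to count
        return exc


def run_load(users, requests_per_user, do_request, during=None):
    """Start `users` threads that each call do_request() repeatedly.

    `during`, if given, is called once from the main thread after the
    workers have been started (step 2 of the report).
    """
    results = []
    lock = threading.Lock()

    def worker():
        for _ in range(requests_per_user):
            outcome = do_request()
            with lock:
                results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(users)]
    for t in threads:
        t.start()
    if during is not None:
        during()
    for t in threads:
        t.join()
    return results


def graceful_restart():
    # Same command as in the report; needs sufficient privileges.
    subprocess.run(["apachectl", "-k", "graceful"], check=False)


def main():
    outcomes = run_load(USERS, REQUESTS_PER_USER,
                        lambda: fetch(TARGET_URL), during=graceful_restart)
    failed = [o for o in outcomes if not isinstance(o, int)]
    print(f"{len(outcomes)} requests, {len(failed)} failed")
```

Calling main() on the web server host runs the load and issues the restart once while requests are in flight. Whether the crash reproduces will depend on timing, so several runs may be needed.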


- Configuration of jk in httpd.conf
LoadModule jk_module modules/mod_jk.so
<IfModule mod_jk.c>
        JkWorkersFile /opt/****/apache2/conf/workers.properties
        JkLogFile /opt/****/apache2/logs/mod_jk.log
        JkLogLevel error
</IfModule>

- error_log
[Fri Aug 25 14:47:16 2006] [notice] Graceful restart requested, doing restart
httpd: Could not determine the server's fully qualified domain name, using 127.0.0.1 for ServerName
[Fri Aug 25 14:47:16 2006] [notice] Apache configured -- resuming normal operations
[Fri Aug 25 14:47:16 2006] [error] (43)Identifier removed: apr_global_mutex_lock(jk_log_lock) failed
[Fri Aug 25 14:47:17 2006] [notice] child pid 1867 exit signal Segmentation fault (11)
[Fri Aug 25 14:47:17 2006] [error] (43)Identifier removed: apr_global_mutex_lock(jk_log_lock) failed
[Fri Aug 25 14:47:18 2006] [error] (43)Identifier removed: apr_global_mutex_lock(jk_log_lock) failed
[Fri Aug 25 14:47:18 2006] [error] (43)Identifier removed: apr_global_mutex_lock(jk_log_lock) failed
[Fri Aug 25 14:47:18 2006] [notice] child pid 1865 exit signal Segmentation fault (11)
[Fri Aug 25 14:47:18 2006] [notice] child pid 1866 exit signal Segmentation fault (11)
[Fri Aug 25 14:47:18 2006] [notice] child pid 1868 exit signal Segmentation fault (11)
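
To see how the mutex failures and the child crashes line up, the error_log excerpt can be tallied with a short script. This is a sketch, not a general Apache log parser: the regular expressions are written only against the lines quoted in this report, and the function name summarize is made up for the example. On Linux, error code 43 is EIDRM ("Identifier removed").

```python
import re
from collections import Counter

# Patterns written against the error_log excerpt in this report only.
CRASH_RE = re.compile(r"child pid (\d+) exit signal Segmentation")
MUTEX_RE = re.compile(r"\(43\)Identifier removed: apr_global_mutex_lock")
STAMP_RE = re.compile(r"^\[([^\]]+)\]")


def summarize(lines):
    """Return (crashed_pids, mutex_failures_per_timestamp)."""
    crashed = []
    mutex_failures = Counter()
    for line in lines:
        m = CRASH_RE.search(line)
        if m:
            crashed.append(int(m.group(1)))
        if MUTEX_RE.search(line):
            stamp = STAMP_RE.match(line)
            mutex_failures[stamp.group(1) if stamp else "unknown"] += 1
    return crashed, mutex_failures


# Lines copied from the excerpt, with the 80-column wrapping undone.
sample = [
    "[Fri Aug 25 14:47:16 2006] [notice] Graceful restart requested, doing restart",
    "[Fri Aug 25 14:47:16 2006] [error] (43)Identifier removed: apr_global_mutex_lock(jk_log_lock) failed",
    "[Fri Aug 25 14:47:17 2006] [notice] child pid 1867 exit signal Segmentation fault (11)",
    "[Fri Aug 25 14:47:17 2006] [error] (43)Identifier removed: apr_global_mutex_lock(jk_log_lock) failed",
    "[Fri Aug 25 14:47:18 2006] [error] (43)Identifier removed: apr_global_mutex_lock(jk_log_lock) failed",
    "[Fri Aug 25 14:47:18 2006] [error] (43)Identifier removed: apr_global_mutex_lock(jk_log_lock) failed",
    "[Fri Aug 25 14:47:18 2006] [notice] child pid 1865 exit signal Segmentation fault (11)",
    "[Fri Aug 25 14:47:18 2006] [notice] child pid 1866 exit signal Segmentation fault (11)",
    "[Fri Aug 25 14:47:18 2006] [notice] child pid 1868 exit signal Segmentation fault (11)",
]

crashed, mutex_failures = summarize(sample)
print("crashed child pids:", crashed)
print("mutex lock failures per second:", dict(mutex_failures))
```

In this excerpt the counts match one to one: four failed lock attempts and four crashed children, all within two seconds of the graceful restart.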


- mod_jk.log
[Fri Aug 25 14:47:16 2006]  [jk_ajp_common.c (1462)]: ERROR: Client connection aborted or network problems
[Fri Aug 25 14:47:17 2006]  [jk_ajp_common.c (1462)]: ERROR: Client connection aborted or network problems
[Fri Aug 25 14:47:18 2006]  [jk_ajp_common.c (1462)]: ERROR: Client connection aborted or network problems
[Fri Aug 25 14:47:18 2006]  [jk_ajp_common.c (1462)]: ERROR: Client connection aborted or network problems

Comment 1 Takayuki Kaneko 2006-09-14 14:25:56 UTC

I have reproduced this problem with 1.2.15 using the following steps.

1) Set reply_timeout to a value such as 30 (seconds).
2) From a browser, post a request that makes Tomcat sleep longer than reply_timeout.
3) Restart Apache with the graceful option while Tomcat is sleeping.

I read apache-2.0/mod_jk.c.
If a graceful restart is issued while mod_jk is processing a request,
apr_global_mutex_unlock fails.
ap_log_rerror is then called, but that call contains a typo,
so a segmentation fault occurs around the DSO.

This typo was fixed in 1.2.16.

Comment 2 Yutaka Tanaka 2006-09-22 06:13:36 UTC

Takayuki,

Thank you for your reply.
I have also confirmed Segmentation Fault had been fixed with 1.2.18(1.2.16 is 
not released).

> [Fri Aug 25 14:47:17 2006] [error] (43)Identifier removed: apr_global_mutex_lock(jk_log_lock) failed

I am not sure whether this is a bug or expected behavior.
Please tell me whether this error causes any problem for the application.
If it does (for example, if requests fail), please let me know how to avoid or
solve it.

Thank you.

Comment 3 Takayoshi Kimura 2006-09-22 07:43:18 UTC

(In reply to comment #2)
> I have also confirmed Segmentation Fault had been fixed with 1.2.18(1.2.16 is 
> not released).

Your first post says it still occurs with 1.2.18. Which is correct?
Is it reproducible with 1.2.18?

Comment 4 Yutaka Tanaka 2006-09-22 08:46:19 UTC

I'm sorry for confusing you.

The "(jk_log_lock) failed" error in error_log still occurs with 1.2.18.
The "Segmentation Fault" error in mod_jk.log is solved with 1.2.18.

Both errors occur with 1.2.6.

Comment 5 Rainer Jung 2006-11-18 20:27:55 UTC

Could any of the commenters please let us know whether either of the problems
still exists with 1.2.19 or the upcoming version 1.2.20?

Otherwise we will close this case.

Comment 6 Daniel Albers 2007-12-12 08:25:58 UTC

Hi everyone,

I'm experiencing this issue on Solaris 10 with mod_jk 1.2.23 in Apache 2.0.53
with different log levels.
Unfortunately I only have dbx available, whose output sometimes puzzles me a
bit, but it looks like jk_log() itself triggers a second call of jk_log().

t@1 (l@1) signal SEGV (no mapping at the fault address) in (unknown) at 0x820fe6c
0x0820fe6c:     outl     (%dx)
Current function is jk_log
  581           l->log(l, level, buf);
(dbx) where
current thread: t@1
  [1] 0x820fe6c(0x0, 0x0, 0x0, 0x0), at 0x820fe6c
=>[2] jk_log(l = 0x820fe58, file = 0x1 "<bad address 0x1>", line = 134497984, funcname = 0x80466f8 "\xbe^F", level = 1633643370, fmt = 0x62 "<bad address 0x62>", ...), line 581 in "jk_util.c"
  [3] jk_log(l = 0x820fe58, file = 0x81032e4 "mod_jk.c", line = 452, funcname = 0x81032ac "ws_write", level = 1, fmt = 0x81032ed "written %d out of %d", ...), line 581 in "jk_util.c"
  [4] ws_write(s = 0x8047900, b = 0x824598b, l = 1726U), line 452 in "mod_jk.c"
  [5] ajp_process_callback(msg = 0x8224df4, pmsg = 0x8224e0c, ae = 0x8224dc0, r = 0x8047900, l = 0x8170a10), line 1447 in "jk_ajp_common.c"
  [6] ajp_get_reply(e = 0x8226de8, s = 0x8047900, l = 0x8170a10, p = 0x8224dc0, op = 0x80467f0), line 1647 in "jk_ajp_common.c"
  [7] ajp_service(e = 0x8226de8, s = 0x8047900, l = 0x8170a10, is_error = 0x804688c), line 1828 in "jk_ajp_common.c"
  [8] service(e = 0x8217d3c, s = 0x8047900, l = 0x8170a10, is_error = 0x80479ec), line 1007 in "jk_lb_worker.c"
  [9] jk_handler(r = 0x82379a8), line 2174 in "mod_jk.c"
  [10] ap_run_handler(r = 0x82379a8), line 152 in "config.c"
  [11] ap_invoke_handler(r = 0x82379a8), line 364 in "config.c"
  [12] ap_process_request(r = 0x82379a8), line 249 in "http_request.c"
  [13] ap_process_http_connection(c = 0x8231a68), line 253 in "http_core.c"
  [14] ap_run_process_connection(c = 0x8231a68), line 43 in "connection.c"
  [15] ap_process_connection(c = 0x8231a68, csd = 0x8231990), line 176 in "connection.c"
  [16] child_main(child_num_arg = 36), line 610 in "prefork.c"
  [17] make_child(s = 0x814b890, slot = 36), line 704 in "prefork.c"
  [18] perform_idle_server_maintenance(p = 0x8146770), line 839 in "prefork.c"
  [19] ap_mpm_run(_pconf = 0x8146770, plog = 0x817c848, s = 0x814b890), line 1040 in "prefork.c"
  [20] main(argc = 4, argv = 0x8047d88), line 623 in "main.c"
(dbx) dump
f = (nil)
used = 0
args = (nil)
buf = ""
rc = 24
l = 0x820fe58
usable_size = 8191
file = 0x1 "<bad address 0x1>"
fmt = 0x62 "<bad address 0x62>"
funcname = 0x80466f8 "\xbe^F"
level = 1633643370
line = 134497984
(dbx) up
Current function is jk_log
  581           l->log(l, level, buf);
(dbx) print buf
buf = "[Wed Dec 12 13:04:44 2007] [15364:0000] [debug] ws_write::mod_jk.c (452): written 1726 out of 1726"

SunOS 5.10 Generic_118855-36 i86pc i386 i86pc
Apache/2.0.59 (Unix) mod_ssl/2.0.59 OpenSSL/0.9.8e mod_jk/1.2.23
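
The odd-looking values in frame [2] can be decoded with a little arithmetic. The sketch below only re-reads numbers already present in the dbx output; the constant and helper names are made up for the example, and the interpretation that follows is a guess that nobody in this thread confirmed.

```python
import struct

# Values copied from the dbx output above.
L_PTR = 0x820FE58          # l, the logger pointer passed to jk_log
FAULT_PC = 0x820FE6C       # address at which SIGSEGV was raised
BOGUS_LEVEL = 1633643370   # "level" reported for frame [2]
BOGUS_LINE = 134497984     # "line" reported for frame [2]


def as_ascii_le(value):
    """Read a 32-bit integer as little-endian bytes (i386) and decode them as ASCII."""
    return struct.pack("<I", value).decode("ascii", errors="replace")


offset = FAULT_PC - L_PTR
print(f"fault address = l + {offset:#x}")
print(f"line  = {BOGUS_LINE:#x}")
print(f"level = {BOGUS_LEVEL:#x} -> {as_ascii_le(BOGUS_LEVEL)!r}")
```

The fault address lies 0x14 bytes past l, so the call through l->log appears to have jumped into the logger data itself rather than into code. The "line" value is 0x80446c0, which looks like a stack address, and the bytes of "level" spell "jk_a". One possible reading is that frame [2] is not a real second call to jk_log but the debugger misreading the stack after a jump through a stale or overwritten function pointer.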

Cheers, Daniel

Comment 7 Rainer Jung 2007-12-12 08:53:59 UTC

Can you easily reproduce it?

If so, it would be very helpful if you could compile and use the sources
available at http://people.apache.org/~rjung/mod_jk-dev/

Those are from a 1.2.26 development snapshot.

Comment 8 Daniel Albers 2007-12-13 03:28:59 UTC

(In reply to comment #7)

No, unfortunately I haven't yet found out how to reproduce it. After reading the
bug reports I was quite sure that it was caused by apachectl graceful, as the
problem only showed up when logadm had run. Executing logadm or manually issuing
apachectl graceful and/or truncating the log files doesn't trigger the error
though. This may even be a totally different issue to what the reporter described.

I installed the dev snapshot and hope to see the error come up again soon.

Comment 9 Daniel Albers 2008-01-07 02:51:50 UTC

(In reply to comment #8)

> I installed the dev snapshot and hope to see the error come up again soon.

The error has not occurred again since then. I tried to make sure that no changes
were introduced in the compile process, so I guess the problem was already fixed
by another code change.

Comment 10 Rainer Jung 2008-01-07 03:53:02 UTC

Closing now, since at the moment we have no more observations of this for recent
versions.