Summary: | Crash, when Apache is shutting down | ||
---|---|---|---|
Product: | Rivet | Reporter: | Mikhail T. <mi+apache> |
Component: | mod_rivet | Assignee: | Apache Rivet Mailing list account <rivet-dev> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | mxmanghi |
Priority: | P2 | ||
Version: | 2.2.3 | ||
Target Milestone: | mod_rivet | ||
Hardware: | PC | ||
OS: | FreeBSD | ||
Attachments: | no channel unregistering |
Description
Mikhail T.
2015-06-07 12:15:33 UTC
Please, would you tell us what are the global parameters in the configuration? For example, the output of puts [::rivet::inspect -all] or parray [array set [::rivet::inspect -all]] would help. Consider also updating your trunk working copy, or re-exporting from trunk if you didn't create your own working copy.

Hold on, your code is of the 2.2 series, not trunk! What version of Rivet are you actually using?

(In reply to Massimo Manghi from comment #1)
> puts [::rivet::inspect -all]
Here it is in a table: http://aldan.algebra.com/~mi/tmp/t.rvt

(In reply to Massimo Manghi from comment #2)
> What version of rivet are you using actually?
As I wrote from the very beginning:
>> (I'm using rivet-2.2.3, but this latest version is not yet mentioned in the list of valid versions in Bugzilla.)
Hope this helps.

Yes, you're right, I added it now. How many virtual hosts do you have in your configuration?

(In reply to Massimo Manghi from comment #4)
> Yes, you're right, I added it now. How many virtual hosts do you have in
> your configuration?
Quite a few (not even sure -- easily 10). Worse, some of them use Websh... I had a problem with the coexistence before (you may remember Bug 54162) -- maybe we are witnessing a regression of that once-fixed problem?

I remember that bug. You have at hand a method that might give a strong indication of whether there is a conflict: disable your WebSH-based web sites and see what happens.

Forgive my wrong wording of the recommendation: I mean to disable the WebSH-based stuff by not loading WebSH altogether, and see what happens.

No, removing the websh-related pieces (the LoadModule and the websh-using vhost definition) did not help :-/ (If you'd like to investigate in situ, send me your passwd entry and ssh key.)

More specifically, ALL of the child httpd processes crash. One of them often dies from SEGFAULT (signal 11):

Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00000008011b28c3 in apr_pool_clear () from /opt/lib/libapr-1.so.0
(gdb) where
#0 0x00000008011b28c3 in apr_pool_clear () from /opt/lib/libapr-1.so.0
#1 0x000000080573342f in child_main (child_num_arg=5) at prefork.c:597
#2 0x00000008057327fe in make_child (s=0x80246b050, slot=5) at prefork.c:800
#3 0x0000000805732d35 in perform_idle_server_maintenance (p=0x802421028) at prefork.c:902
#4 0x00000008057315b5 in prefork_run (_pconf=0x802421028, plog=0x80246f028, s=0x80246b050) at prefork.c:1090
#5 0x000000000043c63b in ap_run_mpm (pconf=0x802421028, plog=0x80246f028, s=0x80246b050) at mpm_common.c:94
#6 0x0000000000430649 in main (argc=2, argv=0x7fffffffe640) at main.c:777
(gdb) info thread
Id Target Id Frame
* 4 Thread 802406400 (LWP 101604) 0x00000008011b28c3 in apr_pool_clear () from /opt/lib/libapr-1.so.0
3 Thread 802407c00 (LWP 100778) 0x0000000801879cea in chdir () from /lib/libc.so.7
* 1 Thread 802406400 (LWP 101604) 0x00000008011b28c3 in apr_pool_clear () from /opt/lib/libapr-1.so.0

while all others die in Rivet_Panic as already submitted.

Created attachment 32801
no channel unregistering
I can't reproduce the problem, but I think it's probably pedantic to unregister a channel just before the child exits (and consequently has all its memory released).
Would you please check whether this patch has any side effects?
If you had a chance to test the code in trunk using the worker MPM, the feedback would be highly valuable.
I tested your case on 2 machines with mixed results: it's been impossible to reproduce the problem on my PC at home, whereas a quick test on a PC at work showed a similar backtrace. In this case one of the calls in the backtrace was to TclX, which wasn't called at all from either mod_rivet or the application code. I deinstalled TclX but the problem still existed, with a slightly different bt output.

The problem seems to be around Tcl_UnregisterChannel, but given your configuration I don't understand how, since it should be called only once.

Notice also that the other backtrace you showed has no calls to mod_rivet in the stack. We should understand if there is still an inter-module interference somehow.

Are you running Apache 2.2 or 2.4? Which patchlevel version exactly?

With the patch applied, the crash happens a few steps later, when the interpreter itself is deleted:
#0 0x00000008019501aa in thr_kill () from /lib/libc.so.7
#1 0x0000000801950116 in raise () from /lib/libc.so.7
#2 0x000000080194e8f9 in abort () from /lib/libc.so.7
#3 0x0000000808b7d08f in ?? () from /opt/libexec/apache24/mod_rivet.so
#4 0x00000008088e0d84 in Tcl_PanicVA (format=<optimized out>, argList=<optimized out>)
at /home/ports/lang/tcl86/work/tcl8.6.4/generic/tclPanic.c:99
#5 0x00000008088e0e72 in Tcl_Panic (format=0x18b90 <error: Cannot access memory at address 0x18b90>)
at /home/ports/lang/tcl86/work/tcl8.6.4/generic/tclPanic.c:153
#6 0x000000080881a60f in Tcl_AsyncDelete (async=0x8024fbed0) at /home/ports/lang/tcl86/work/tcl8.6.4/generic/tclAsync.c:283
#7 0x000000080881c021 in DeleteInterpProc (interp=0x81c582010)
at /home/ports/lang/tcl86/work/tcl8.6.4/generic/tclBasic.c:1423
#8 0x0000000808b7c90f in ?? () from /opt/libexec/apache24/mod_rivet.so
#9 0x00000008011b24be in apr_pool_destroy () from /opt/lib/libapr-1.so.0
#10 0x0000000805733872 in clean_child_exit (code=0) at prefork.c:218
#11 0x0000000805733807 in just_die (sig=15) at prefork.c:344
#12 0x00000008015f38eb in ?? () from /lib/libthr.so.3
#13 0x00000008015f2ffc in ?? () from /lib/libthr.so.3
#14 <signal handler called>
#15 0x000000080195136a in _select () from /lib/libc.so.7
#16 0x00000008015f0f72 in ?? () from /lib/libthr.so.3
#17 0x000000080891be70 in NotifierThreadProc (clientData=<optimized out>) at tclUnixNotfy.c:1216
#18 0x00000008015ee775 in ?? () from /lib/libthr.so.3
#19 0x0000000000000000 in ?? ()
> Are you running Apache 2.2 or 2.4? Which patchlevel version exactly?
Using apache-2.4.12.
It should be OK now with Rivet 2.2.4.