Summary: | problem with Failover in jk_lb_worker.c | ||
---|---|---|---|
Product: | Tomcat Connectors | Reporter: | Chuck Betts <chuck_betts> |
Component: | Common | Assignee: | Tomcat Developers Mailing List <dev> |
Status: | CLOSED WORKSFORME | ||
Severity: | normal | ||
Priority: | P2 | ||
Version: | unspecified | ||
Target Milestone: | --- | ||
Hardware: | Sun | ||
OS: | Solaris | ||
Bug Depends on: | 36525 | ||
Bug Blocks: | |||
Attachments: |
A full log file showing one failed request
workers.properties
Mod_jk configuration file |
Description
Chuck Betts
2005-08-19 21:08:37 UTC
Created attachment 16117 [details]
A full log file showing one failed request
I would like to see your workers.properties and the Jk directives in your httpd configuration.

Also, could you please check whether the failing Apache child processes are still there or whether they died? The log shows the PID of the relevant process directly after the date. If the process is still there, you can use /usr/proc/bin/pstack PID to write a thread dump. That way we can confirm in which function the hanging process sits. You could attach the pstack output. If the process is no longer there, it might have crashed. If you succeed in dumping a core file of the process, you can again use pstack on that core file.

Finally, it might be interesting to use truss (or dtrace, if you are already familiar with it) to find out whether the error is related to some system call. Configure Apache to use only very few children (for example, start 2 servers and set the spare max and min to 2 as well). Start the server via

truss -f -o some_output_file -w all -r all -v all \
/usr/local/apache/bin/apachectl start

and then redo your tests (Apache will be a little slow). The file some_output_file contains information about all system calls, bytes read and written, signals received, errnos, etc.

To have a somewhat easier test case: in the log you attached, one can see that the working request for the HTML page produces two image requests going to two Apache processes in parallel, which both fail. It would be a simpler retest to use only single requests, nothing in parallel, e.g. without the browser automatically loading embedded objects.

Created attachment 16119 [details]
workers.properties
Created attachment 16120 [details]
Mod_jk configuration file
The pstack returned nothing, and I confirmed by watching top that the process is created and then dies when the HTTP request is over. I am having trouble running truss; I'll get back to you on that when I can.

The config is pretty basic and looks OK. The process death is strange, and I don't see an immediate reason for it. It would be good if you could reproduce it without parallel requests to simplify the situation. Did you build Apache and mod_jk yourself? Have they been compiled using the same compiler and the same CFLAGS?

Sorry for all this. We were building on a separate machine using the gcc compiler because our primary Sun build machine wasn't working at the time. Now it is, and I recompiled on it, and everything is hunky-dory now. FYI: the compiler we used was Forte from Sun, and the flags are: CFLAGS="-mt -fast -xarch=v8plusa -xtarget=ultra2 -xO5 -KPIC" LDFLAGS="-L/lib -R/usr/local/lib -R/usr/lib -R/lib -lpthread"

Most likely this is a duplicate of 36525, which is fixed in the already released 1.2.15. Since it was a 64-bit alignment bug, some compilers might show the bug and others not.
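For readers wondering how an alignment bug can appear with one compiler and not another, here is a minimal sketch in C. It is a generic illustration only, not the code from mod_jk or from the fix for 36525, whose details are not given in this report. The helper names align_up, read_u64 and write_u64 are hypothetical. On SPARC, loading a 64-bit value from an address that is not a multiple of 8 can kill the process with SIGBUS. Whether that happens depends on how the compiler lays out the data and which load instructions it emits, so the same source can work in one build and crash in another.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: round `offset` up to the next multiple of `align`.
 * `align` must be a power of two (8 for 64-bit values). */
static size_t align_up(size_t offset, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/* Hypothetical helper: read a 64-bit value from an arbitrary byte offset.
 * memcpy has no alignment requirement, so this is safe even when
 * buf + offset is not a multiple of 8. Casting buf + offset to
 * uint64_t * and dereferencing it instead is undefined behaviour for a
 * misaligned address and can trap on strict-alignment CPUs. */
static uint64_t read_u64(const unsigned char *buf, size_t offset)
{
    uint64_t value;
    memcpy(&value, buf + offset, sizeof value);
    return value;
}

/* Hypothetical helper: the matching alignment-safe store. */
static void write_u64(unsigned char *buf, size_t offset, uint64_t value)
{
    memcpy(buf + offset, &value, sizeof value);
}
```

A caller that packs records into a raw buffer (shared memory, for example) would either place each 64-bit field at align_up(offset, 8) or go through read_u64 and write_u64. Either approach avoids depending on what a particular compiler happens to do with a misaligned access.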