| Summary: | Some processes never terminate after graceful restart | | |
|---|---|---|---|
| Product: | Apache httpd-2 | Reporter: | jaroslav |
| Component: | mpm_worker | Assignee: | Apache HTTPD Bugs Mailing List <bugs> |
| Status: | NEW --- | | |
| Severity: | normal | | |
| Priority: | P2 | | |
| Version: | 2.4.38 | | |
| Target Milestone: | --- | | |
| Hardware: | PC | | |
| OS: | Linux | | |
| Attachments: | GDB output | | |

Description
jaroslav
2019-11-28 14:27:38 UTC
> There is no mention about which specific resource was exhausted, my guess would be some process/thread-related limit

Yes, the limits are discussed in the pthread_create() manpage.

You should get backtraces of these processes using gdb: https://httpd.apache.org/dev/debugging.html

Created attachment 36904: GDB output

The most interesting thread (to me) seems to be thread 77, which is waiting in a read() call. The /proc/pid/syscall output for that thread is:
0 0xf 0x7fc5a27cc75f 0x1 0x0 0x0 0x0 0x7fc5a27cc6e0 0x7fc5cfa48544
If I understand that correctly, it is a read from FD 15, which (according to /proc/pid/fd/15) is pipe:[70519261].
According to lsof, this pipe is exclusive to the parent process 8904 and its threads, and is not used by any other process in the system.
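
As a rough illustration of the decoding above, the following Python sketch splits a /proc/pid/syscall line into its fields. It assumes the documented Linux layout (syscall number, six argument registers, stack pointer, program counter) and an x86-64 system, where syscall number 0 is read(). The function name `decode_syscall_line` is hypothetical and not part of any tool mentioned in this report.

```python
def decode_syscall_line(line):
    """Split one /proc/<pid>/syscall line into named parts.

    Returns None for lines that do not have the 9-field form
    (for example "running", or "-1 <sp> <pc>").
    """
    fields = line.split()
    if len(fields) != 9:
        return None
    return {
        "nr": int(fields[0]),                        # syscall number (decimal)
        "args": [int(f, 16) for f in fields[1:7]],   # six argument registers (hex)
        "sp": int(fields[7], 16),                    # stack pointer
        "pc": int(fields[8], 16),                    # program counter
    }


# The line quoted in this report for thread 77.
sample = "0 0xf 0x7fc5a27cc75f 0x1 0x0 0x0 0x0 0x7fc5a27cc6e0 0x7fc5cfa48544"
info = decode_syscall_line(sample)

# Assumption: x86-64, where syscall number 0 is read(fd, buf, count).
if info is not None and info["nr"] == 0:
    fd, buf, count = info["args"][:3]
    summary = f"read(fd={fd}, buf={buf:#x}, count={count})"
    print(summary)
```

Under those assumptions the sample line decodes to a one-byte read() on file descriptor 15, which agrees with the reading given above.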
Changing status back to NEW after providing the GDB output. If more information is needed, let me know.

Looks like there is a request handled by mod_fcgid that never finishes due to a hanging FCGI process.

Hello, could you please elaborate on what "hanging FCGI process" means in this context? The server uses mod_fcgid combined with suexec to launch php-cgi processes. It is quite possible for a PHP process to hang, or more precisely to take too much time to process the request, but in that case mod_fcgid terminates it forcibly (SIGTERM, then SIGKILL). Also, all FCGI processes are terminated on graceful restart, and we even have a protective cron job that hunts down orphaned PHP processes left in the system when Apache crashes. In short, there are no FCGI processes older than an hour in the system, so the Apache process (now over 36 hours old and inactive for most of that time) has nothing to wait for: all connections used for communication between Apache and FCGI should read EOF.

However, while investigating this I noticed one more thing. The thread stuck reading from FD 15 is reading from "pipe:[70519261]"; ls -l /proc/pid/fd/ shows:
lr-x------ 1 root root 64 Nov 29 13:49 15 -> 'pipe:[70519261]'
However, there is also this:
l-wx------ 1 root root 64 Nov 29 13:49 16 -> 'pipe:[70519261]'
Apparently, the inactive Apache process is reading from the pipe but also has it open for writing, so the read will never return EOF. So is the reason the Apache process is stuck that at some point a file descriptor (here FD 16) leaked and was not closed when it should have been?
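
The pipe behaviour described above can be reproduced in isolation. The sketch below is only an illustration in Python and is not taken from httpd or mod_fcgid: it creates a pipe inside one process, switches the read end to non-blocking mode so the demonstration cannot hang, and shows that a read reports "no data yet" rather than EOF for as long as any write end is still open. The variable names are hypothetical.

```python
import os

# One process holds both ends of the same pipe, like FD 15 and FD 16 above.
read_fd, write_fd = os.pipe()

# Non-blocking mode turns "would block forever" into an exception,
# so this demonstration terminates instead of hanging.
os.set_blocking(read_fd, False)

try:
    os.read(read_fd, 1)
    state_with_writer_open = "data"
except BlockingIOError:
    # No data and no EOF: a blocking read() would sleep here indefinitely.
    state_with_writer_open = "would block"

# Once the last write end is closed, read() returns an empty result (EOF).
os.close(write_fd)
state_after_close = "EOF" if os.read(read_fd, 1) == b"" else "data"
os.close(read_fd)

print(state_with_writer_open, "->", state_after_close)
```

If this matches what happens in the stuck process, a blocking read() on FD 15 can only finish once every copy of the write end, including FD 16 in the same process, has been closed.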