Bug 57268 - apache process crashes when downloading large file
Summary: apache process crashes when downloading large file
Status: RESOLVED FIXED
Alias: None
Product: Apache httpd-2
Classification: Unclassified
Component: mpm_event
Version: 2.4.10
Hardware: PC Linux
Importance: P2 normal
Target Milestone: ---
Assignee: Apache HTTPD Bugs Mailing List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-11-26 14:46 UTC by jaroslav
Modified: 2014-12-02 14:07 UTC



Attachments
r1639614 (2.58 KB, patch)
2014-12-01 11:41 UTC, Yann Ylavic
r1638879 + r1640031 (3.76 KB, patch)
2014-12-01 11:42 UTC, Yann Ylavic
backtrace (1.64 KB, application/gzip)
2014-12-01 23:42 UTC, jaroslav

Description jaroslav 2014-11-26 14:46:53 UTC
When using the event MPM, Apache processes randomly crash while serving a request to download a large file. The crash seems to occur when the download takes a long time: downloading a 256 MB file over a 10 Mbit line triggers it, but downloading the same file over a gigabit LAN works fine. It also seems to occur only when the server is busy serving other requests.

This is how I am able to reproduce the issue on my test server:

- create a small file in the document root (index.html with content "index") and a large file large.bin (dd if=/dev/zero of=large.bin bs=$((1024**2)) count=256)
- run ab -q -n 2000000 -c 20 'http://[a:b:c:d::1]/index.html' on the server
- run wget 'http://[a:b:c:d::1]/large.bin' on my workstation (the download takes about 5 minutes)
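
The steps above can be scripted roughly as follows. This is only a sketch and not part of the original report: the function name make_fixtures, its docroot argument and the optional size argument are names introduced here for illustration. The ab and wget invocations are the ones listed above; they are left as comments because they need a running server.

```shell
#!/bin/sh
# make_fixtures DOCROOT [SIZE_MB]
# Creates index.html (content "index") and large.bin (zeros) in DOCROOT.
# make_fixtures and its arguments are illustrative names, not from the report.
make_fixtures() {
    docroot="$1"
    size_mb="${2:-256}"   # the report used a 256 MB file

    mkdir -p "$docroot" || return 1
    # Small file requested by ab
    printf 'index\n' > "$docroot/index.html" || return 1
    # Large file requested by wget, written in 1 MiB blocks
    dd if=/dev/zero of="$docroot/large.bin" bs=$((1024*1024)) count="$size_mb" 2>/dev/null
}

# Example (the document root path is a placeholder):
#   make_fixtures /path/to/docroot
# Background load, run on the server:
#   ab -q -n 2000000 -c 20 'http://[a:b:c:d::1]/index.html'
# Slow download, run from a client on a slow link:
#   wget 'http://[a:b:c:d::1]/large.bin'
```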

The server transmits some data and then wget reports "Connection closed at byte xxx. Retrying." The amount of data transmitted differs on every attempt: sometimes the crash occurs after a few seconds, sometimes it takes a few minutes, and in a few cases the file downloaded without crashing Apache.

When the server was idle (only wget running, no ab), it did not crash in any of my attempts.

When the crash occurs, Apache logs this to the error log:

[Wed Nov 26 14:38:14.982708 2014] [core:notice] [pid 12215:tid 140446535845760] AH00052: child pid 16079 exit signal Segmentation fault (11).

In some cases the kernel logs a message like this one to dmesg (this message is from a different server, hence the different pid):

[1943221.021819] apache2[31073]: segfault at 7f3c380c52d8 ip 00007f3c59d66ab2 sp 00007f3c3afece30 error 4 in mod_mpm_event.so[7f3c59d60000+e000]
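
The kernel line above gives the faulting instruction pointer (ip) and, in brackets, the load address and size of mod_mpm_event.so. Subtracting the load address from ip gives the offset inside the module, which a tool such as addr2line may be able to resolve when debug symbols are available. A minimal sketch: module_offset is a hypothetical helper name, and the module path in the last comment is an assumption about the Debian layout.

```shell
#!/bin/sh
# module_offset IP BASE
# Prints IP - BASE in hex; both arguments are hex strings without a 0x prefix.
module_offset() {
    printf '0x%x\n' $(( 0x$1 - 0x$2 ))
}

# Values taken from the dmesg line above:
module_offset 00007f3c59d66ab2 7f3c59d60000   # prints 0x6ab2

# Possible follow-up (module path is an assumption):
#   addr2line -f -e /usr/lib/apache2/modules/mod_mpm_event.so 0x6ab2
```

The offset 0x6ab2 is below the mapping size 0xe000, which is consistent with the fault being inside mod_mpm_event.so.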

We are using Apache from the Debian Jessie repository (apache2-bin), together with libapache2-mod-fcgid, php5-cgi, suexec (apache2-suexec-custom) and libapache2-mod-geoip. (PHP is not involved in processing any of the files mentioned above.) The list of active modules follows (from server-info): core.c, event.c, http_core.c, mod_access_compat.c, mod_alias.c, mod_auth_basic.c, mod_authn_core.c, mod_authn_file.c, mod_authz_core.c, mod_authz_host.c, mod_authz_user.c, mod_autoindex.c, mod_deflate.c, mod_dir.c, mod_env.c, mod_fcgid.c, mod_filter.c, mod_geoip.c, mod_info.c, mod_log_config.c, mod_logio.c, mod_mime.c, mod_negotiation.c, mod_setenvif.c, mod_so.c, mod_status.c, mod_unique_id.c, mod_unixd.c, mod_version.c, mod_watchdog.c

If I recall correctly, the test server is running the default configuration from the Debian package - let me know if you need any configuration details.

At the moment it seems switching to mpm_worker works around the problem.
Comment 1 Yann Ylavic 2014-12-01 11:38:16 UTC
There are two pending patches in trunk for possible mpm_event crashes (see the follow-up attachments).
Can you apply those in your environment, and preferably test them separately so that we can determine which one (if not both) should be backported?

Otherwise, can you provide a backtrace of the crash using gdb and a core file? (http://httpd.apache.org/dev/debugging.html may help.)

You will probably need to add "CoreDumpDirectory /tmp" to httpd.conf, and start httpd with an unlimited core file size (ulimit -c unlimited).

Once you have a core(.pid) file in /tmp:
$ gdb /path/to/httpd -c /tmp/core.xxxx
(gdb) thread apply all bt

Then please post the output here.
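
The interactive gdb session above can also be run non-interactively, which makes it easier to save the output to a file for attaching. A sketch that relies on gdb's -batch and -ex options; collect_backtrace and the output file name are illustrative names, and the paths are the same placeholders as above.

```shell
#!/bin/sh
# collect_backtrace HTTPD_BINARY CORE_FILE OUTPUT_FILE
# Writes a backtrace of all threads to OUTPUT_FILE.
# Returns 1 when the binary or the core file is missing.
# collect_backtrace is a hypothetical helper, not an httpd or gdb tool.
collect_backtrace() {
    [ -x "$1" ] || { echo "binary not found: $1" >&2; return 1; }
    [ -r "$2" ] || { echo "core file not found: $2" >&2; return 1; }
    # Same command as typed at the (gdb) prompt, run in batch mode
    gdb -batch -ex 'thread apply all bt' "$1" "$2" > "$3" 2>&1
}

# Example (placeholder paths):
#   ulimit -c unlimited      # in the shell that starts httpd
#   collect_backtrace /path/to/httpd /tmp/core.xxxx /tmp/backtrace.txt
```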
Comment 2 Yann Ylavic 2014-12-01 11:41:39 UTC
Created attachment 32245 [details]
r1639614
Comment 3 Yann Ylavic 2014-12-01 11:42:41 UTC
Created attachment 32246 [details]
r1638879 + r1640031
Comment 4 jaroslav 2014-12-01 22:41:43 UTC
Hello, thank you for the reply.

Apache 2.4.10 in Debian already ships with patches r1638879 + r1640031 applied, so these don't fix this crash.

I applied the other one - r1639614 - to the Debian source tree and so far it seems to fix the issue. I am no longer able to reproduce the crash, even after multiple attempts.

If you still need the backtrace or any other information, let me know...
Comment 5 Eric Covener 2014-12-01 23:02:11 UTC
(In reply to jaroslav from comment #4)
> Hello, thank you for the reply.
> 
> Apache 2.4.10 in Debian is already shipped with patches r1638879 + r1640031
> applied, so these don't fix this crash.
> 
> I applied the other one - r1639614 - on Debian source tree and so far it
> seems the issue is fixed with it. I am no longer able to reproduce the
> crash, even after multiple tries.
> 
> If you still need the backtrace or any other information, let me know...

Sorry to take you up on it, but it would be nice to see a backtrace without r1639614 just to confirm. It is a bit odd that it helps for a crash mid-response.
Comment 6 jaroslav 2014-12-01 23:42:24 UTC
Created attachment 32247 [details]
backtrace
Comment 7 jaroslav 2014-12-01 23:46:52 UTC
No problem, I added it to the attachments. The test server has the apache2-dbg package (debugging symbols) installed, so I hope it helps.
Comment 8 Eric Covener 2014-12-02 00:26:44 UTC
(In reply to jaroslav from comment #7)
> No problem, added it into attachments. The testing server has apache2-dbg
> package (debugging symbols) installed, hope it helps.

Thanks, it does actually seem consistent, at least on the surface.
Comment 9 Yann Ylavic 2014-12-02 14:05:00 UTC
Both r1639614 and r1638879+r1640031 have been backported to the upcoming 2.4.11.
Thanks, Jaroslav, for reporting and testing.
Comment 10 jaroslav 2014-12-02 14:07:07 UTC
Thank you for the help.