Summary: | Creating a large number of SSL sites using DBDDriver pgsql causes a SIGSEGV / SIGILL on load | ||
---|---|---|---|
Product: | Apache httpd-2 | Reporter: | Alex Bligh <alex> |
Component: | mod_ssl | Assignee: | Apache HTTPD Bugs Mailing List <bugs> |
Status: | RESOLVED DUPLICATE | ||
Severity: | major | CC: | alex |
Priority: | P2 | ||
Version: | 2.4.10 | ||
Target Milestone: | --- | ||
Hardware: | PC | ||
OS: | Linux | ||
Attachments: |
Perl program to generate config to replicate the bug
Perl program to replicate the bug under 2.4.10
Demonstration patch to work around the bug |
Description
Alex Bligh
2014-09-06 13:11:42 UTC
BZ 54357 contains another user report of a crash in the same stack directly after start for 2.4.7, which was commented by the same user as being fixed for him after updating to 2.4.9. Any chance you can update to latest 2.4 and try again?

[ Note for anyone trying to duplicate this: On a clean container on the same machine, I needed 141 or more sites to duplicate this. Also it appears it is necessary to enable mod_php]

On 2.4.10_1ubuntu1 (utopic version recompiled for trusty), this appears not to occur, which is good news. BZ 54357 appears to involve certificate stapling, which I have switched off (I believe that's the default). I would rather use 2.4.7 if possible simply because that is the stock version Ubuntu distribute and support. Failing that, I'm happy to identify the specific issue, recompile, and try to persuade Ubuntu to apply a patch to 2.4.7. Any idea what the underlying issue is here, or how I might work around it without an upgrade?

Looks like I spoke too soon. This *DOES* occur on 2.4.10, it's just more difficult to replicate. Of course it replicates just fine with my real-world example. Here's a backtrace of it dying on 2.4.10. I will try to amend the test case to replicate this. In the meantime is there anything further I can do to debug this?

root@nimtest:~# gdb --args /usr/sbin/apache2 -k start -X -e Debug
GNU gdb (Ubuntu 7.7-0ubuntu3.1) 7.7
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/sbin/apache2...Reading symbols from /usr/lib/debug//usr/sbin/apache2...done.
done.
(gdb) run
Starting program: /usr/sbin/apache2 -k start -X -e Debug
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[Mon Sep 08 15:53:47.686373 2014] [so:debug] [pid 15446] mod_so.c(266): AH01575: loaded module access_compat_module from /usr/lib/apache2/modules/mod_access_compat.so
[Mon Sep 08 15:53:47.690215 2014] [so:debug] [pid 15446] mod_so.c(266): AH01575: loaded module alias_module from /usr/lib/apache2/modules/mod_alias.so
[Mon Sep 08 15:53:47.695217 2014] [so:debug] [pid 15446] mod_so.c(266): AH01575: loaded module auth_basic_module from /usr/lib/apache2/modules/mod_auth_basic.so
[Mon Sep 08 15:53:47.697928 2014] [so:debug] [pid 15446] mod_so.c(266): AH01575: loaded module authn_core_module from /usr/lib/apache2/modules/mod_authn_core.so
[Mon Sep 08 15:53:47.703892 2014] [so:debug] [pid 15446] mod_so.c(266): AH01575: loaded module authn_file_module from /usr/lib/apache2/modules/mod_authn_file.so
[Mon Sep 08 15:53:47.708513 2014] [so:debug] [pid 15446] mod_so.c(266): AH01575: loaded module authz_core_module from /usr/lib/apache2/modules/mod_authz_core.so
[Mon Sep 08 15:53:47.714280 2014] [so:debug] [pid 15446] mod_so.c(266): AH01575: loaded module authz_groupfile_module from /usr/lib/apache2/modules/mod_authz_groupfile.so
[Mon Sep 08 15:53:47.717910 2014] [so:debug] [pid 15446] mod_so.c(266): AH01575: loaded module authz_host_module from /usr/lib/apache2/modules/mod_authz_host.so
[Mon Sep 08 15:53:47.725992 2014] [so:debug] [pid 15446] mod_so.c(266): AH01575: loaded module authz_user_module from /usr/lib/apache2/modules/mod_authz_user.so
[Mon Sep 08 15:53:47.733997 2014] [so:debug] [pid 15446] mod_so.c(266): AH01575: loaded module autoindex_module from /usr/lib/apache2/modules/mod_autoindex.so
[Mon Sep 08 15:53:47.739117 2014] [so:debug] [pid 15446] mod_so.c(266): AH01575: loaded module cache_module from /usr/lib/apache2/modules/mod_cache.so
[Mon Sep 08 15:53:47.744871 2014] [so:debug] [pid 15446] mod_so.c(266): AH01575: loaded module cgi_module from /usr/lib/apache2/modules/mod_cgi.so
[Mon Sep 08 15:53:47.750762 2014] [so:debug] [pid 15446] mod_so.c(266): AH01575: loaded module dbd_module from /usr/lib/apache2/modules/mod_dbd.so
[Mon Sep 08 15:53:47.757628 2014] [so:debug] [pid 15446] mod_so.c(266): AH01575: loaded module deflate_module from /usr/lib/apache2/modules/mod_deflate.so
[Mon Sep 08 15:53:47.765739 2014] [so:debug] [pid 15446] mod_so.c(266): AH01575: loaded module dir_module from /usr/lib/apache2/modules/mod_dir.so
[Mon Sep 08 15:53:47.772183 2014] [so:debug] [pid 15446] mod_so.c(266): AH01575: loaded module env_module from /usr/lib/apache2/modules/mod_env.so
[Mon Sep 08 15:53:47.780369 2014] [so:debug] [pid 15446] mod_so.c(266): AH01575: loaded module filter_module from /usr/lib/apache2/modules/mod_filter.so
[Mon Sep 08 15:53:47.788832 2014] [so:debug] [pid 15446] mod_so.c(266): AH01575: loaded module headers_module from /usr/lib/apache2/modules/mod_headers.so
[Mon Sep 08 15:53:47.794207 2014] [so:debug] [pid 15446] mod_so.c(266): AH01575: loaded module ident_module from /usr/lib/apache2/modules/mod_ident2.so
[Mon Sep 08 15:53:47.797959 2014] [so:debug] [pid 15446] mod_so.c(266): AH01575: loaded module lbmethod_byrequests_module from /usr/lib/apache2/modules/mod_lbmethod_byrequests.so
[Mon Sep 08 15:53:47.801879 2014] [so:debug] [pid 15446] mod_so.c(266): AH01575: loaded module mime_module from /usr/lib/apache2/modules/mod_mime.so
[Mon Sep 08 15:53:47.806730 2014] [so:debug] [pid 15446] mod_so.c(266): AH01575: loaded module mpm_prefork_module from /usr/lib/apache2/modules/mod_mpm_prefork.so
[Mon Sep 08 15:53:47.813710 2014] [so:debug] [pid 15446] mod_so.c(266): AH01575: loaded module negotiation_module from /usr/lib/apache2/modules/mod_negotiation.so
[Mon Sep 08 15:53:47.952346 2014] [so:debug] [pid 15446] mod_so.c(266): AH01575: loaded module php5_module from /usr/lib/apache2/modules/libphp5.so
[Mon Sep 08 15:53:47.957451 2014] [so:debug] [pid 15446] mod_so.c(266): AH01575: loaded module proxy_module from /usr/lib/apache2/modules/mod_proxy.so
[Mon Sep 08 15:53:47.960908 2014] [so:debug] [pid 15446] mod_so.c(266): AH01575: loaded module proxy_balancer_module from /usr/lib/apache2/modules/mod_proxy_balancer.so
[Mon Sep 08 15:53:47.964292 2014] [so:debug] [pid 15446] mod_so.c(266): AH01575: loaded module proxy_http_module from /usr/lib/apache2/modules/mod_proxy_http.so
[Mon Sep 08 15:53:47.967260 2014] [so:debug] [pid 15446] mod_so.c(266): AH01575: loaded module reqtimeout_module from /usr/lib/apache2/modules/mod_reqtimeout.so
[Mon Sep 08 15:53:47.971368 2014] [so:debug] [pid 15446] mod_so.c(266): AH01575: loaded module rewrite_module from /usr/lib/apache2/modules/mod_rewrite.so
[Mon Sep 08 15:53:47.974517 2014] [so:debug] [pid 15446] mod_so.c(266): AH01575: loaded module setenvif_module from /usr/lib/apache2/modules/mod_setenvif.so
[Mon Sep 08 15:53:47.977591 2014] [so:debug] [pid 15446] mod_so.c(266): AH01575: loaded module slotmem_shm_module from /usr/lib/apache2/modules/mod_slotmem_shm.so
[Mon Sep 08 15:53:47.980582 2014] [so:debug] [pid 15446] mod_so.c(266): AH01575: loaded module socache_shmcb_module from /usr/lib/apache2/modules/mod_socache_shmcb.so
[Mon Sep 08 15:53:47.990700 2014] [so:debug] [pid 15446] mod_so.c(266): AH01575: loaded module ssl_module from /usr/lib/apache2/modules/mod_ssl.so
[Mon Sep 08 15:53:47.994364 2014] [so:debug] [pid 15446] mod_so.c(266): AH01575: loaded module status_module from /usr/lib/apache2/modules/mod_status.so
[Mon Sep 08 15:53:47.997902 2014] [so:debug] [pid 15446] mod_so.c(266): AH01575: loaded module substitute_module from /usr/lib/apache2/modules/mod_substitute.so
AH00548: NameVirtualHost has no effect and will be removed in
the next release /etc/apache2/sites-enabled/000-extility-amber-listen.conf:15
AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 127.0.1.1. Set the 'ServerName' directive globally to suppress this message
[New Thread 0x7fffe7166700 (LWP 15462)]
[Thread 0x7fffe7166700 (LWP 15462) exited]

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff03e5139 in ?? () from /usr/lib/apache2/modules/mod_ssl.so
(gdb) bt full
#0 0x00007ffff03e5139 in ?? () from /usr/lib/apache2/modules/mod_ssl.so
        No symbol table info available.
#1 0x00007ffff274f7a6 in int_free_ex_data (class_index=<optimized out>, obj=0x555555b03830, ad=0x555555b03858) at ex_data.c:522
        mx = 2
        i = 0
        item = 0x5555558331f0
        ptr = <optimized out>
        storage = 0x555555b03ab0
#2 0x00007ffff27f0061 in x509_cb (operation=operation@entry=3, pval=pval@entry=0x7fffffffdfe8, it=it@entry=0x7ffff2aab780 <X509_it>, exarg=exarg@entry=0x0) at x_x509.c:113
        ret = 0x555555b03830
#3 0x00007ffff27f3fea in asn1_item_combine_free (pval=pval@entry=0x7fffffffdfe8, it=it@entry=0x7ffff2aab780 <X509_it>, combine=combine@entry=0) at tasn_fre.c:173
        tt = <optimized out>
        seqtt = <optimized out>
        ef = <optimized out>
        cf = <optimized out>
        aux = <optimized out>
        asn1_cb = 0x7ffff27effa0 <x509_cb>
        i = <optimized out>
#4 0x00007ffff27f41c5 in ASN1_item_free (val=0x555555b03830, it=it@entry=0x7ffff2aab780 <X509_it>) at tasn_fre.c:71
        No locals.
#5 0x00007ffff27f014c in X509_free (a=<optimized out>) at x_x509.c:141
        No locals.
#6 0x00007ffff24caf2d in SSL_load_client_CA_file (file=<optimized out>) at ssl_cert.c:726
        in = 0x555555b02990
        x = 0x555555b03830
        xn = <optimized out>
        ret = <optimized out>
        sk = 0x555555b04b70
#7 0x00007ffff03ca871 in ssl_init_PushCAList (ca_list=0x555555b04190, s=0x7fffebad9238, ptemp=0x7ffff7fc0028, file=<optimized out>) at ssl_engine_init.c:1587
        n = <optimized out>
        sk = <optimized out>
#8 0x00007ffff03cae50 in ssl_init_FindCAList (s=s@entry=0x7fffebad9238, ptemp=ptemp@entry=0x7ffff7fc0028, ca_file=0x7fffebad75c0 "/etc/ssl/certs/extility-cluster-ca.crt", ca_path=0x7fffebad7578 "/etc/ssl/none") at ssl_engine_init.c:1637
        ca_list = 0x555555b04190
#9 0x00007ffff03cb38e in ssl_init_ctx_verify (p=0x7ffff7ff0028, mctx=0x7ffff7e18140, ptemp=0x7ffff7fc0028, s=0x7fffebad9238) at ssl_engine_init.c:674
        ctx = 0x555555b01d30
        verify = <optimized out>
        ca_list = <optimized out>
#10 ssl_init_ctx (s=0x7fffebad9238, p=0x7ffff7ff0028, ptemp=0x7ffff7fc0028, mctx=0x7ffff7e18140) at ssl_engine_init.c:863
        No locals.
#11 0x00007ffff03cc4d8 in ssl_init_server_ctx (pphrases=0x7ffff7eab110, sc=0x7ffff7e3ff50, ptemp=0x7ffff7fc0028, p=0x7ffff7ff0028, s=0x7fffebad9238) at ssl_engine_init.c:1370
        rv = <optimized out>
#12 ssl_init_ConfigureServer (s=s@entry=0x7fffebad9238, p=p@entry=0x7ffff7ff0028, ptemp=ptemp@entry=0x7ffff7fc0028, sc=0x7ffff7e3ff50, pphrases=pphrases@entry=0x7ffff7eab110) at ssl_engine_init.c:1469
        No locals.
#13 0x00007ffff03cd319 in ssl_init_Module (p=0x7ffff7ff0028, plog=<optimized out>, ptemp=0x7ffff7fc0028, base_server=0x7ffff7fc2de0) at ssl_engine_init.c:304
        mc = <optimized out>
        sc = <optimized out>
        s = 0x7fffebad9238
        rv = <optimized out>
        pphrases = 0x7ffff7eab110
#14 0x00005555555ab019 in ap_run_post_config (pconf=0x7ffff7ff0028, plog=0x7ffff7fbe028, ptemp=0x7ffff7fc0028, s=0x7ffff7fc2de0) at config.c:103
        pHook = 0x7ffff7eef0c0
        n = 15
        rv = 540686391
#15 0x000055555558b137 in main (argc=6, argv=0x7fffffffe598) at main.c:765
        c = 101 'e'
        showcompile = 0
        showdirectives = 0
        confname = 0x5555555cb4e7 "apache2.conf"
        def_server_root = 0x5555555cb4da "/etc/apache2"
        temp_error_log = 0x0
        error = <optimized out>
        process = 0x7ffff7ff2118
        pconf = 0x7ffff7ff0028
        plog = 0x7ffff7fbe028
        ptemp = 0x7ffff7fc0028
        pcommands = 0x7ffff7fc8028
        opt = 0x7ffff7fc8118
        rv = <optimized out>
        mod = 0x5555557ed160 <ap_prelinked_modules+64>
        opt_arg = 0x7fffffffe826 "Debug"
        signal_server = <optimized out>
(gdb)

Created attachment 31975 [details]
Perl program to replicate the bug under 2.4.10
I've attached what I believe is the minimal perl program to replicate the bug under 2.4.10.
This simply adds one line to the server config:
SSLCACertificateFile /etc/ssl/certs/ssl-cert-snakeoil.pem
Perhaps there was something fixed in normal SSLCertificate loading that has yet to be fixed in CA certificate loading.
(note, you can have a single site that uses an SSLCACertificateFile after a lot of sites not using them, and still see the bug)

I believe I now understand the root cause of this bug and it's very NASTY. I don't think it's properly fixed in 2.4.10 for any certificates.

What appears to be happening is this. In main.c (prior to line 702), apache processes the config file. This dlopen()'s mod_ssl, and calls ssl_init_Module(). At main.c line 707, inside the 'for (;;)' loop around reloads, it does an apr_pool_clear(). This dlclose()'s all the modules that have been opened, and reprocesses the configuration, which again dlopen()'s mod_ssl and calls ssl_init_Module() again.

However, inspection shows that mod_ssl isn't always loaded at the same address. If you have a large configuration, that's more likely (not sure whether it's a memory leak, or fragmentation, or what, but this is the cause). For instance, I put a breakpoint on ssl_init_Module(), and then ran apache2 and got the output below. You can see the location of ssl_init_Module has changed.

This in itself would not be an issue. However, when the certificates are first loaded through openssl, they are set up with a free_func (in the openssl structure) that points to something in mod_ssl freeing the additional storage - see http://osxr.org/openssl/source/crypto/ex_data.c#0566 When the config file is reprocessed, that free_func's address changes. However, the openssl object has not (yet) been freed. When it is, the free_func() is called using the PREVIOUS address associated with mod_ssl.

What I believe is happening here is that the SSL library is checking to see whether a certificate with the same CN has already been loaded, here: http://osxr.org/openssl/source/ssl/ssl_cert.c#0707 and deinitialising mod_ssl is not clearing the loaded certificates. This is probably deliberate as there may be other users of the openssl library that might also be loading certificates.

I don't really know how one would go about fixing this.
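The mechanism described in this comment can be modelled without OpenSSL or a real dlopen(). What follows is only a rough sketch under stated assumptions, not mod_ssl or OpenSSL code: `module_t`, `registry_t`, `FREE_FUNC_OFFSET` and every function name are hypothetical, addresses are plain integers so nothing is ever actually called, and the model assumes that each load of the module registers its callback again while earlier registrations stay in the library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical offset of the free callback inside the module image. */
#define FREE_FUNC_OFFSET UINT64_C(0x1139)
#define MAX_SLOTS 8

/* Hypothetical model of a dynamically loaded module. */
typedef struct {
    uint64_t base;   /* address the loader mapped the module at */
    bool loaded;
} module_t;

/* Hypothetical model of library-side callback slots that outlive the module. */
typedef struct {
    uint64_t free_func[MAX_SLOTS];  /* absolute addresses captured at registration */
    int count;
} registry_t;

void module_load(module_t *m, uint64_t base) { m->base = base; m->loaded = true; }
void module_unload(module_t *m) { m->loaded = false; }

/* Record the callback address as it is right now; returns the slot index. */
int registry_add(registry_t *r, const module_t *m)
{
    assert(m->loaded && r->count < MAX_SLOTS);
    r->free_func[r->count] = m->base + FREE_FUNC_OFFSET;
    return r->count++;
}

/* Count slots that no longer point at the callback in the mapped module. */
int registry_stale_slots(const registry_t *r, const module_t *m)
{
    int stale = 0;
    for (int i = 0; i < r->count; i++) {
        if (!m->loaded || r->free_func[i] != m->base + FREE_FUNC_OFFSET)
            stale++;
    }
    return stale;
}

/* Load, register, unload, load again at second_base, register again. */
int stale_after_reload(uint64_t first_base, uint64_t second_base)
{
    module_t mod = {0, false};
    registry_t reg = {{0}, 0};

    module_load(&mod, first_base);
    registry_add(&reg, &mod);
    module_unload(&mod);              /* the registry keeps the old address */
    module_load(&mod, second_base);
    registry_add(&reg, &mod);
    return registry_stale_slots(&reg, &mod);
}
```

Using the two ssl_init_Module addresses from the gdb session in this report purely as sample values, `stale_after_reload(0x7ffff03d7000, 0x7ffff03cd000)` reports one stale slot, while passing the same address twice reports none, which is the situation a "module stays mapped at the same place" approach would aim for.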
The least horrible option I think would be to never dlclose() a module once it is loaded. I.e. a reload of apache2 would leave modules in RAM (but presumably deinited), so that they would always be at the same place. A reload would then leave them loaded.

Breakpoint 1, ssl_init_Module (p=0x7ffff7ff0028, plog=0x7ffff7fbe028, ptemp=0x7ffff7fbc028, base_server=0x7ffff7fc1ec8) at ssl_engine_init.c:138
138 {
(gdb) print pc
No symbol "pc" in current context.
(gdb) print &ssl_init_Module
$1 = (apr_status_t (*)(apr_pool_t *, apr_pool_t *, apr_pool_t *, server_rec *)) 0x7ffff03d7000 <ssl_init_Module>
(gdb) cont
Continuing.
warning: Temporarily disabling breakpoints for unloaded shared library "/usr/lib/apache2/modules/mod_ssl.so"
[New Thread 0x7fffe6fee700 (LWP 56253)]
[Thread 0x7fffe6fee700 (LWP 56253) exited]

Breakpoint 1, ssl_init_Module (p=0x7ffff7ff0028, plog=0x7ffff7fbe028, ptemp=0x7ffff7fc0028, base_server=0x7ffff7fc2de0) at ssl_engine_init.c:138
138 {
(gdb) print &ssl_init_Module
$2 = (apr_status_t (*)(apr_pool_t *, apr_pool_t *, apr_pool_t *, server_rec *)) 0x7ffff03cd000 <ssl_init_Module>
(g

Created attachment 31977 [details]
Demonstration patch to work around the bug
A minimal patch for this bug is attached. This swaps the SEGV for a memory leak, on the basis that a memory leak is probably less bad. I am not sure this is either suitable or a complete solution.
The patch works as follows: the problem is that the address of certinfo_free is being stored somewhere deep in openssl. When mod_ssl is dlclose()'d and dlopen()'d again, the address of certinfo_free may change. openssl then calls the free function at its old location, and SEGV / illegal instruction ensues. By not providing a free function for the extra data, we avoid openssl calling anything.
This appears to avoid the test case crashing, which at least means the problem is correctly identified.
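The trade-off the patch makes can be sketched in isolation. This is a simplified model under stated assumptions, not the attached patch: `extra_t`, `object_t`, `extra_release` and `object_destroy` are hypothetical names, and a flag stands in for actually freeing memory. An owner that holds no free callback for an object's extra data skips the call entirely, so nothing can jump to a stale address, but the extra data is never released.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-object extra data; `released` stands in for free(). */
typedef struct {
    int released;
} extra_t;

typedef void (*free_fn)(void *extra);

/* Hypothetical object owning extra data plus an optional free callback. */
typedef struct {
    extra_t *extra;
    free_fn on_free;   /* NULL models registering no free function */
} object_t;

/* Callback a module would normally supply. */
void extra_release(void *extra)
{
    assert(extra != NULL);
    ((extra_t *)extra)->released = 1;
}

/* Destroy an object. Returns 1 if the extra data was released, 0 if it leaked. */
int object_destroy(object_t *o)
{
    int released = 0;

    if (o->on_free != NULL && o->extra != NULL) {
        o->on_free(o->extra);
        released = 1;
    }
    o->extra = NULL;   /* the object forgets the data either way */
    return released;
}
```

With `extra_release` supplied, `object_destroy()` returns 1 and the data is marked released; with `on_free` set to NULL it returns 0 and the data is left untouched, which corresponds to the leak accepted in exchange for avoiding the crash.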
I would guess the proper cleanup is missing in ssl_init_ModuleKill.

I'm not sure what the proper way to fix this is. It's tempting to call CRYPTO_cleanup_all_ex_data, but I don't think that's the right solution.

Firstly ssl_cleanup_pre_config says:

/* Also don't call CRYPTO_cleanup_all_ex_data here; any registered
 * ex_data indices may have been cached in static variables in
 * OpenSSL; removing them may cause havoc. Notably, with OpenSSL
 * versions >= 0.9.8f, COMP_CTX cleanups would not be run, which
 * could result in a per-connection memory leak (!). */

Secondly some other ssl user (for instance a DBD driver using an SSL interface to the database) may not take kindly to us stomping on its data.

It's tempting to remove the index that X509_get_ex_new_index added, removing the data, except that as far as I can see openssl doesn't have an API call to do that.

That would leave us attempting to ensure that every single object that mod_ssl allocates is freed. But firstly, I'm not sure how to do that, and secondly this won't fix the problem where there is some other ssl user that also allocates objects. It would also be inherently fragile.

The final option would be to rewrite the stapling code so it didn't use ex_data at all. To me this seems like the best route, but I don't understand the stapling code well enough to do it.

Is there some easier option I have missed?

Thank you for the thorough debugging and analysis, Alex. I think it's really a duplicate of bug 54357, and it would be best to dupe this one into it (or vice versa).

(In reply to Alex Bligh from comment #9)
> The final option would be to rewrite the stapling code so it didn't use
> ex_data at all. To me this seems like the best route, but I don't understand
> the stapling code well enough to do it.
>
> Is there some easier option I have missed?
One option might be to avoid ex_data fiddling in the "first round", based on a ssl_config_global_isfixed() check - i.e., something like this (untested):

Index: ssl_engine_init.c
===================================================================
--- ssl_engine_init.c (revision 1624017)
+++ ssl_engine_init.c (working copy)
@@ -272,7 +272,9 @@
         return HTTP_INTERNAL_SERVER_ERROR;
     }
 #ifdef HAVE_OCSP_STAPLING
-    ssl_stapling_ex_init();
+    if (ssl_config_global_isfixed(mc) == TRUE) {
+        ssl_stapling_ex_init();
+    }
 #endif

     /*
@@ -1067,6 +1069,7 @@
      * later, we defer to the code in ssl_init_server_ctx.
      */
     if ((mctx->stapling_enabled == TRUE) &&
+        (ssl_config_global_isfixed(mc) == TRUE) &&
         !ssl_stapling_init_cert(s, mctx, cert)) {
         ap_log_error(APLOG_MARK, APLOG_ERR, 0, s, APLOGNO(02567)
                      "Unable to configure certificate %s for stapling",
@@ -1418,7 +1421,8 @@
      * (late) point makes sure that we catch both certificates loaded
      * via SSLCertificateFile and SSLOpenSSLConfCmd Certificate.
      */
-    if (sc->server->stapling_enabled == TRUE) {
+    if ((sc->server->stapling_enabled == TRUE) &&
+        (ssl_config_global_isfixed(myModConfig(s)) == TRUE)) {
         X509 *cert;
         int i = 0;
         int ret = SSL_CTX_set_current_cert(sc->server->ssl_ctx,

Getting rid of ex_data might be cleaner in the end, and was actually one of Joe's questions on the dev list in October 2009:

https://mail-archives.apache.org/mod_mbox/httpd-dev/200910.mbox/%3C20091025200721.GA20714@redhat.com%3E

(see also bug 43822)

(In reply to Kaspar Brand from comment #10)
> One option might be to avoid ex_data fiddling in the "first round", based on
> a ssl_config_global_isfixed() check - i.e., something like this (untested):
>
> Index: ssl_engine_init.c
> ===================================================================
> --- ssl_engine_init.c (revision 1624017)
> +++ ssl_engine_init.c (working copy)
> @@ -272,7 +272,9 @@
> return HTTP_INTERNAL_SERVER_ERROR;
> }
> #ifdef HAVE_OCSP_STAPLING
> - ssl_stapling_ex_init();
> + if (ssl_config_global_isfixed(mc) == TRUE) {
> + ssl_stapling_ex_init();
> + }
> #endif

Maybe I am missing something, but we always call

ssl_config_global_fix(mc);

a few lines above. So the condition would be always true.

*** This bug has been marked as a duplicate of bug 54357 ***

(In reply to Ruediger Pluem from comment #11)
> Maybe I am missing something, but we always call
>
> ssl_config_global_fix(mc);
>
> a few lines above. So the condition would be always true.

You're absolutely right, my bad. Forget about my idea in comment 10 completely, as it also wouldn't work for restarts. |
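The objection raised in comment 11 comes down to ordering, which a few lines of C can illustrate. This is a hedged sketch, not the mod_ssl source: it assumes that ssl_config_global_fix() does nothing more than set a flag that ssl_config_global_isfixed() reads back (the thread implies this but does not show the code), and `mod_config_t`, `config_global_fix`, `config_global_isfixed` and `guarded_call_runs` are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the module-wide config record. */
typedef struct {
    bool fixed;
} mod_config_t;

/* Assumed behavior: "fix" sets the flag, "isfixed" reads it. */
void config_global_fix(mod_config_t *mc) { mc->fixed = true; }
bool config_global_isfixed(const mod_config_t *mc) { return mc->fixed; }

/* Models the init path; returns true when the guarded call would run. */
bool guarded_call_runs(mod_config_t *mc)
{
    assert(mc != 0);
    config_global_fix(mc);             /* always runs before the guard */
    return config_global_isfixed(mc);  /* so the guard cannot fail */
}
```

Whether the record starts out with the flag clear (a first pass) or already set (a later pass), `guarded_call_runs()` returns true, so under this assumption a guard of that shape cannot tell the two rounds apart.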