To reproduce it, set the following configuration entries in httpd.conf:

LoadModule authn_core_module modules/mod_authn_core.so
<AuthnProviderAlias file file1>
AuthName "dfdf"
</AuthnProviderAlias>

Start the server and you will see the segmentation fault.

I don't quite understand the problem.

I put some printf() calls in the invoke_cmd() function. It seems that the segfault occurs while the AuthName directive is being executed. The code reaches "return cmd->AP_TAKE1(parms, mconfig, w);" but does not reach set_authname(), the handler function of the AuthName directive.

Please check it.

Thanks a lot!
Oh, sorry, I missed something in the previous comment. To reproduce it, use the following configuration (both modules have to be loaded):

LoadModule authn_core_module modules/mod_authn_core.so
LoadModule auth_digest_module modules/mod_auth_digest.so
<AuthnProviderAlias file file1>
AuthName "dfdf"
</AuthnProviderAlias>

It seems the two modules conflict with each other?

(In reply to comment #0)
> To replay it, set the following configuration entries in the httpd.conf:
>
> LoadModule authn_core_module modules/mod_authn_core.so
> <AuthnProviderAlias file file1>
> AuthName "dfdf"
> </AuthnProviderAlias>
>
> Start server and you will see the segmentation fault.
>
> I don't quite understand the problem.
>
> I put some printf() in the invoke_cmd() function. It seems that the segfault
> occurs when it's executing the AuthName directive. The code reaches "return
> cmd->AP_TAKE1(parms, mconfig, w);" but does not reach the handler function of
> the AuthName directive -- set_authname().
>
> Please check it.
>
> Thanks a lot!!!
It seems AuthnProviderAlias breaks some assumption in create_digest_dir_config(). The crash does not happen if I remove these lines:

--- a/modules/aaa/mod_auth_digest.c
+++ b/modules/aaa/mod_auth_digest.c
@@ -454,10 +454,6 @@ static void *create_digest_dir_config(apr_pool_t *p, char *dir)
 {
     digest_config_rec *conf;

-    if (dir == NULL) {
-        return NULL;
-    }
-
     conf = (digest_config_rec *) apr_pcalloc(p, sizeof(digest_config_rec));
     if (conf) {
         conf->qop_list = apr_palloc(p, sizeof(char*));

I haven't tested whether this makes AuthnProviderAlias actually work, though. Can you try it?
(In reply to comment #2)
> It seems AuthnProviderAlias breaks some assumption in
> create_digest_dir_config(). The crash does not happen if I remove these lines:
>
> --- a/modules/aaa/mod_auth_digest.c
> +++ b/modules/aaa/mod_auth_digest.c
> @@ -454,10 +454,6 @@ static void *create_digest_dir_config(apr_pool_t *p, char *dir)
>  {
>      digest_config_rec *conf;
>
> -    if (dir == NULL) {
> -        return NULL;
> -    }
> -
>      conf = (digest_config_rec *) apr_pcalloc(p, sizeof(digest_config_rec));
>      if (conf) {
>          conf->qop_list = apr_palloc(p, sizeof(char*));
>
> I haven't tested if this makes AuthnProviderAlias actually work, though. Can
> you try it?

Yes, I tried. Now there is no segfault any more. But directives like AuthName and AuthType have no effect inside the <AuthnProviderAlias> block.
(In reply to comment #2)
> It seems AuthnProviderAlias breaks some assumption in
> create_digest_dir_config(). The crash does not happen if I remove these lines:
>
> --- a/modules/aaa/mod_auth_digest.c
> +++ b/modules/aaa/mod_auth_digest.c
> @@ -454,10 +454,6 @@ static void *create_digest_dir_config(apr_pool_t *p, char *dir)
>  {
>      digest_config_rec *conf;
>
> -    if (dir == NULL) {
> -        return NULL;
> -    }
> -
>      conf = (digest_config_rec *) apr_pcalloc(p, sizeof(digest_config_rec));
>      if (conf) {
>          conf->qop_list = apr_palloc(p, sizeof(char*));
>
> I haven't tested if this makes AuthnProviderAlias actually work, though. Can
> you try it?

By the way, could you also explain the problem a little? Thanks a lot!
mod_auth_digest tries to avoid allocating memory for its own config struct in global server context because AuthDigestShmemSize, which is its only directive allowed in that context, doesn't need the struct. This optimization breaks with AuthnProviderAlias.

I don't know yet if the correct fix is to make AuthnProviderAlias simulate per-directory context, or if mod_auth_digest should be changed to either not make that optimization, or to detect global server context in a different way.

Also, I am not familiar enough with AuthnProviderAlias to say if it should support AuthName and AuthType. If yes, then this is probably a different bug than the segfault. If no, AuthnProviderAlias should log an error if these directives are used. Maybe someone more familiar with AuthnProviderAlias could comment?
(In reply to comment #5)
> mod_auth_digest tries to avoid allocating memory for its own config struct in
> global server context because AuthDigestShmemSize, which is its only directive
> allowed in that context, doesn't need the struct. This optimization breaks with
> AuthnProviderAlias.
>
> I don't know yet if the correct fix is to make AuthnProviderAlias simulate
> per-directory context, or if mod_auth_digest should be changed to either not
> make that optimization, or to detect global server context in a different way.
>

Thank you very much, Stefan! I will take a look at this issue. Your information is helpful.

> Also, I am not familiar enough with AuthnProviderAlias to say if it should
> support AuthName and AuthType. If yes, then this is probably a different bug
> than the segfault. If no, AuthnProviderAlias should log an error if these
> directives are used. Maybe someone more familiar with AuthnProviderAlias could
> comment?

Hmmm... this should not be a big deal. There are already too many silent behaviors in current Apache :P
I seem to have reproduced the crash with the following configuration:

<AuthnProviderAlias ldap world_company >
AuthName "LDAP_world_company"
AuthLDAPBindDN "CN=xxx xxx,OU=yyy,OU=zzz,OU=People,DC=company,DC=world"
AuthLDAPBindPassword "c0ma!"
AuthLDAPURL ldap://*****:389/****?sAMAccountName
Require valid-user
</AuthnProviderAlias>

With some difficulty I managed to get the following stack trace:

#0 0x00007ffff7c00a30 in set_realm (cmd=<optimized out>, config=0x0, realm=0x7ffff43485b8 "LDAP_world_company") at mod_auth_digest.c:493
#1 0x00005555555ae3e2 in invoke_cmd (cmd=0x7ffff7c07a00 <digest_cmds>, parms=parms@entry=0x7fffffffd030, mconfig=0x0, args=<optimized out>) at config.c:928
#2 0x00005555555b0a69 in ap_walk_config_sub (section_vector=0x7ffff4348410, parms=0x7fffffffd030, current=0x7ffff436b398) at config.c:1339
#3 ap_walk_config (current=0x7ffff436b398, parms=parms@entry=0x7fffffffd030, section_vector=section_vector@entry=0x7ffff4348410) at config.c:1372
#4 0x00007ffff7bf876f in authaliassection (cmd=0x7fffffffd030, mconfig=<optimized out>, arg=0x7ffff436b380 "ldap world_company >") at mod_authn_core.c:257
#5 0x00005555555ae2af in invoke_cmd (cmd=0x7ffff7bfac90 <authn_cmds+80>, parms=parms@entry=0x7fffffffd030, mconfig=0x7ffff7bfd448, args=<optimized out>) at config.c:895
#6 0x00005555555b0a69 in ap_walk_config_sub (section_vector=0x7ffff7c25540, parms=0x7fffffffd030, current=0x7ffff436b338) at config.c:1339
#7 ap_walk_config (current=0x7ffff436b338, parms=parms@entry=0x7fffffffd030, section_vector=0x7ffff7c25540) at config.c:1372
#8 0x00005555555b1ec5 in ap_process_config_tree (s=<optimized out>, conftree=<optimized out>, p=0x7ffff7fc6028, ptemp=<optimized out>) at config.c:2156
#9 0x000055555558abfa in main (argc=<optimized out>, argv=<optimized out>) at main.c:686

Variables at frame #3:

(gdb) info args
current = 0x7ffff436b340
parms = 0x7fffffffd030
section_vector = 0x7ffff4348400
(gdb) print *current
$9 = {
  directive = 0x7ffff7bf9090 "AuthName",
  args = 0x7ffff436b388 "\"LDAP_world_company\"",
  next = 0x7ffff436b398,
  first_child = 0x0,
  parent = 0x7ffff436b2e0,
  data = 0x0,
  filename = 0x7ffff436b058 "/etc/apache2/sites-enabled/world-company-site.conf",
  line_num = 25,
  last = 0x0
}
(gdb) print *parms
$10 = {
  info = 0x0,
  override = 72,
  override_opts = 239,
  override_list = 0x0,
  limited = -1,
  limited_xmethods = 0x0,
  xlimited = 0x0,
  config_file = 0x0,
  directive = 0x7ffff436b340,
  pool = 0x7ffff7fc6028,
  temp_pool = 0x7ffff7c26028,
  server = 0x7ffff7c28ac0,
  path = 0x0,
  cmd = 0x7ffff7c07a00 <digest_cmds>,
  context = 0x7ffff4348400,
  err_directive = 0x0
}

Server version: Apache/2.4.34 (Ubuntu)
Server built: 2018-10-03T13:57:22
The Apache crash was first noticed in 2.4.41. The error stack looked similar to the bug that was fixed in 2.4.43, but the issue was still seen in 2.4.43. We changed the system-level memory settings, but the issue kept happening even after that.

# vi rc.local
# (Add the lines below to the end. Replace eth0, eth1 with the actual names)
# sysctl -w net.core.rmem_max=16777216
# sysctl -w net.core.wmem_max=16777216
# sysctl -w net.ipv4.tcp_rmem="4096 87380 16777216"
# sysctl -w net.ipv4.tcp_wmem="4096 65536 16777216"
# sysctl -w net.ipv4.tcp_fin_timeout=10
# ifconfig eth0 txqueuelen 10000
# ifconfig eth2 txqueuelen 10000
# reboot

14) Configure new ulimit settings
# pbrun su-root
# cd /etc/security
# vi limits.conf
# (Add these lines to the end)
# root hard nofile 16384
# root soft nofile 16384
# apache hard nofile 16384
# apache soft nofile 16384
# :wq
# reboot

We suspected that the load on the server had grown, possibly because of an increase in the number of application users. Since we use the worker MPM, whose settings govern how the server copes with a growing number of connections, we increased the MPM values.

/apps/hrs/tt/bin/ ./apachectl -V |grep MPM
Server MPM: worker

ServerLimit was set to 300; we are increasing it to 600. StartServers goes from 200 to 500 and MaxClients to 15000. MinSpareThreads and MaxSpareThreads were changed to the values recommended in the Apache documentation (75 and 250).

ServerLimit 600
StartServers 500
MaxClients 15000
MinSpareThreads 75 (recommended value)
MaxSpareThreads 250 (recommended value)
ThreadsPerChild 25
MaxRequestsPerChild 0

But the issue was still seen, with the exception below, after implementing the change. I was researching this issue again and fortunately found the source code for Apache. Here is the source code link:
people.apache.org/~igalic/checks/httpd/2012-09-14-1/report-69Y1He.html

After the exception below, the server hangs and cannot serve any request.
[Fri Aug 07 16:12:12.684173 2020] [mpm_event:debug] [pid 19580:tid 139740527363840] event.c(1810): Too many open connections (25), not accepting new conns in this process

This exception has been seen since we enabled debug logging.

[Fri Aug 07 16:12:12.889710 2020] [mpm_event:debug] [pid 19609:tid 139740820498176] event.c(2314): AH02471: start_threads: Using epoll (wakeable)
[Fri Aug 07 16:12:12.889934 2020] [mpm_event:debug] [pid 19610:tid 139740820498176] event.c(2314): AH02471: start_threads: Using epoll (wakeable)

Lines in the source code referring to the exception:

1805 else if (connections_above_limit()) {
1806     disable_listensocks();
1807     ap_log_error(APLOG_MARK, APLOG_DEBUG, 0, ap_server_conf,
1808                  "Too many open connections (%u), "
1809                  "not accepting new conns in this process",
1810                  apr_atomic_read32(&connection_count));
1811     ap_log_error(APLOG_MARK, APLOG_TRACE1, 0, ap_server_conf,
1812                  "Idle workers: %u",

The %u value logged here (25) matches our ThreadsPerChild setting, which is the default recommended value, so it looks like the request load on the server is higher than this parameter allows. We increased ThreadsPerChild to 50 and changed the other parameters to match, but the issue was still seen.

From:

ServerLimit 600
StartServers 500
MaxClients 15000
MinSpareThreads 75
MaxSpareThreads 250
ThreadsPerChild 25

New value:

ServerLimit 300
StartServers 200
MaxClients 15000
MinSpareThreads 75
MaxSpareThreads 250
ThreadsPerChild 50

After these changes we got a different exception:

AH00486: server seems busy, (you may need to increase StartServers, ThreadsPerChild or Min/MaxSpareThreads), spawning 8 children, there are around 54 idle threads, 6 active children, and 6 children that are shutting down.
We set ThreadsPerChild to 40 and MinSpareThreads and MaxSpareThreads to 25 and 75 respectively, together with the semaphore changes at the system level, but the issue is still seen.

Changes made at the server level:

# 17) Increase semaphore limits
# pbrun su-root
# vi /etc/sysctl.conf
# Add this line:
# # Add additional semaphores for Channel Secure and mod_rewrite
# kernel.sem = 4096 512000 1600 9000
# :wq
# sysctl -p /etc/sysctl.conf

What I notice is that the webagent fails to initialise every time and the server loses its connections: either no connections arrive from the F5, or the server does not take up the connections. netstat shows a very low user count, say 25, and then the Apache URL goes down for that individual Apache server even though the Apache process is up.