Bug 65411 - NamingException in JNDIRealm.getPrincipal(String username, GSSCredential gssCredential) ends in locked-up realm
Status: RESOLVED FIXED
Alias: None
Product: Tomcat 9
Classification: Unclassified
Component: Catalina
Version: 9.0.39
Hardware: All All
Importance: P2 regression
Target Milestone: -----
Assignee: Tomcat Developers Mailing List
Reported: 2021-06-30 11:47 UTC by Ole Ostergaard
Modified: 2021-06-30 13:28 UTC (History)



Description Ole Ostergaard 2021-06-30 11:47:47 UTC
A NamingException thrown during JNDIRealm.getPrincipal(String username, GSSCredential gssCredential) leaves the realm locked up, which in turn causes Tomcat to lock up.

Similar to the fix for Bug 65033 (https://bz.apache.org/bugzilla/show_bug.cgi?id=65033), the second catch block for NamingException should also close the connection and unlock the singleConnectionLock.

It seems that the connection used to be closed there, but since 9.0.39 and this commit (https://github.com/apache/tomcat/commit/95658dfd868216db0773c38aad8eebf544024b09?branch=95658dfd868216db0773c38aad8eebf544024b09), close(connection) is no longer called.

Corresponding PR: https://github.com/apache/tomcat/pull/430
Comment 1 Remy Maucherat 2021-06-30 12:17:56 UTC
Well, that was quite intentional, since the regular NamingException nearly always occurs when connecting, so there is no connection to close (it will be null). Can you give the full stack trace?
Comment 2 Ole Ostergaard 2021-06-30 12:42:08 UTC
That is true, but the lock still does not get released.

This is the stack trace of the exception that leaves the lock held (Tomcat 9.0.45):

30-Jun-2021 03:45:35.211 SEVERE [https-jsse-nio2-8443-exec-154] org.apache.catalina.realm.JNDIRealm.getPrincipal Exception performing authentication
	javax.naming.NamingException: LDAP response read timed out, timeout used: 5000 ms.; remaining name 'ou=people, dc=knime, dc=com'
		at com.sun.jndi.ldap.LdapRequest.getReplyBer(LdapRequest.java:129)
		at com.sun.jndi.ldap.Connection.readReply(Connection.java:469)
		at com.sun.jndi.ldap.LdapClient.getSearchReply(LdapClient.java:638)
		at com.sun.jndi.ldap.LdapClient.search(LdapClient.java:561)
		at com.sun.jndi.ldap.LdapCtx.doSearch(LdapCtx.java:2013)
		at com.sun.jndi.ldap.LdapCtx.searchAux(LdapCtx.java:1872)
		at com.sun.jndi.ldap.LdapCtx.c_search(LdapCtx.java:1797)
		at com.sun.jndi.toolkit.ctx.ComponentDirContext.p_search(ComponentDirContext.java:392)
		at com.sun.jndi.toolkit.ctx.PartialCompositeDirContext.search(PartialCompositeDirContext.java:358)
		at com.sun.jndi.toolkit.ctx.PartialCompositeDirContext.search(PartialCompositeDirContext.java:341)
		at javax.naming.directory.InitialDirContext.search(InitialDirContext.java:267)
		at org.apache.catalina.realm.JNDIRealm.getUserBySearch(JNDIRealm.java:1698)
		at org.apache.catalina.realm.JNDIRealm.getUser(JNDIRealm.java:1535)
		at org.apache.catalina.realm.JNDIRealm.getUser(JNDIRealm.java:1463)
		at org.apache.catalina.realm.JNDIRealm.getPrincipal(JNDIRealm.java:2418)
		at org.apache.catalina.realm.JNDIRealm.getPrincipal(JNDIRealm.java:2347)
		at org.apache.catalina.realm.JNDIRealm.getPrincipal(JNDIRealm.java:2311)
		at org.apache.catalina.realm.RealmBase.authenticate(RealmBase.java:312)
		at org.apache.catalina.realm.CombinedRealm.authenticate(CombinedRealm.java:154)
...

And this is the jstack output that we took when we realised something was off (~150 waiting threads, and no thread that holds the lock as far as we can see):

"https-jsse-nio2-8443-exec-326" #7207 daemon prio=5 os_prio=0 tid=0x00007fa9bc139800 nid=0x630d waiting on condition [0x00007fa93003f000]
   java.lang.Thread.State: WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  <0x00000004401dd5a8> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
	at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
	at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
	at org.apache.catalina.realm.JNDIRealm.get(JNDIRealm.java:2469)
	at org.apache.catalina.realm.JNDIRealm.authenticate(JNDIRealm.java:1302)
	at org.apache.catalina.realm.CombinedRealm.authenticate(CombinedRealm.java:191)
	at org.apache.catalina.realm.LockOutRealm.authenticate(LockOutRealm.java:154)
	at org.apache.catalina.authenticator.BasicAuthenticator.doAuthenticate(BasicAuthenticator.java:101)
	at org.apache.catalina.authenticator.AuthenticatorBase.authenticate(AuthenticatorBase.java:740)
...

The get() call in getPrincipal(String username, GSSCredential gssCredential) acquires the lock via singleConnectionLock.lock(), and the lock is never released. (https://github.com/apache/tomcat/blob/main/java/org/apache/catalina/realm/JNDIRealm.java#L2280)

close(connection) not only closes the connection but also releases the lock (singleConnectionLock.unlock()). (https://github.com/apache/tomcat/blob/main/java/org/apache/catalina/realm/JNDIRealm.java#L2143-L2148)
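
To illustrate the failure mode, here is a minimal standalone sketch. It is not the actual JNDIRealm code: the class SingleConnectionLockSketch, the search() helper and the two getPrincipal variants are hypothetical stand-ins, and the real method's retry logic is left out. It only assumes what is described above: get() acquires singleConnectionLock, and close(connection) is what releases it.

```java
import java.util.concurrent.locks.ReentrantLock;
import javax.naming.NamingException;

// Hypothetical stand-in for the realm; only the locking behaviour is modelled.
class SingleConnectionLockSketch {

    private final ReentrantLock singleConnectionLock = new ReentrantLock();

    // Stand-in for get(): acquires the lock before handing out the connection.
    private Object get() {
        singleConnectionLock.lock();
        return new Object();
    }

    // Stand-in for close(connection): releasing the lock happens here.
    private void close(Object connection) {
        if (singleConnectionLock.isHeldByCurrentThread()) {
            singleConnectionLock.unlock();
        }
    }

    // Stand-in for the LDAP search, which can time out after connecting.
    private String search(Object connection, boolean fail) throws NamingException {
        if (fail) {
            throw new NamingException("LDAP response read timed out");
        }
        return "principal";
    }

    // Reported behaviour: the catch block returns without calling close().
    String getPrincipalLeaky(boolean fail) {
        Object connection = get();
        try {
            String principal = search(connection, fail);
            close(connection);
            return principal;
        } catch (NamingException e) {
            return null; // lock is still held here
        }
    }

    // Proposed behaviour: the catch block also calls close(), releasing the lock.
    String getPrincipalFixed(boolean fail) {
        Object connection = get();
        try {
            String principal = search(connection, fail);
            close(connection);
            return principal;
        } catch (NamingException e) {
            close(connection);
            return null;
        }
    }

    boolean isLocked() {
        return singleConnectionLock.isLocked();
    }

    public static void main(String[] args) {
        SingleConnectionLockSketch leaky = new SingleConnectionLockSketch();
        leaky.getPrincipalLeaky(true);
        System.out.println("leaky variant still locked: " + leaky.isLocked());

        SingleConnectionLockSketch fixed = new SingleConnectionLockSketch();
        fixed.getPrincipalFixed(true);
        System.out.println("fixed variant still locked: " + fixed.isLocked());
    }
}
```

In the leaky variant the lock stays held after the exception, so every other request thread that calls lock() parks indefinitely, which matches the waiting threads in the thread dump above.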
Comment 3 Remy Maucherat 2021-06-30 13:28:27 UTC
Given the trace, it is completely reasonable. The fix will be in 10.1.0-M3, 10.0.9, 9.0.51, 8.5.70.