Hi, We have recently ran into a deadlock situation in a customer site while using websockets. One of the locks is in our code while the other is the Tomcat "stateLock" defined in "class WsSession". I have marked this issue as a regression because one of the 2 locations that Tomcat takes the stateLock in 8.0.33 did not exist in 8.0.29. Here's a quick look at the differences before examining the deadlocked call stacks: In 8.0.33 there's an expansion of the stateLock usage: protected void registerFuture(FutureToSendHandler f2sh) { boolean fail = false; synchronized (stateLock) { // leaving out the details... } } While in 8.0.29 the entire method looks like: protected void registerFuture(FutureToSendHandler f2sh) { futures.put(f2sh, f2sh); } I can understand why the stateLock usage has been expanded. But I think there is a problem with this. Previously we'd only enter stateLock in Tomcat -> Application calls. Now with the new registerFuture we also can enter stateLock in Application -> Tomcat calls. So a recent change seems to have introduced a key ingredient for deadlocks. If my application has its own lock there is the potential for 2 locks to be acquired in reverse order by 2 simultaneous threads. We have experienced a real-world example of this: In the 1st thread Tomcat is closing the session after an abrupt disconnect from the client. The close action bubbles up to the Application onClose handler. "http-nio-8445-exec-76" daemon prio=5 tid=249 BLOCKED at com.myc.common.domain.websocketutil.WsOutbound.setSocketStatus(WsOutbound.java:152) --> Waits on the Application local lock Local Variable: com.myc.common.domain.websocketutil.WsOutbound$SocketStatus#2 at com.myc.customer.web.socket.WebSocketEndpoint.onClose(WebSocketEndpoint.java:224) --> Application onClose handler called which has to do some tidy up Local Variable: javax.websocket.CloseReason#2 Local Variable: com.myc.customer.web.socket.WebSocketEndpoint#4 Local Variable: java.lang.Object#9184 at sun.reflect.GeneratedMethodAccessor250.invoke() at sun.reflect.DelegatingMethodAccessorImpl.invoke() at java.lang.reflect.Method.invoke() at org.apache.tomcat.websocket.pojo.PojoEndpointBase.onClose(PojoEndpointBase.java:107) Local Variable: org.apache.tomcat.websocket.pojo.PojoEndpointServer#6 at org.apache.tomcat.websocket.WsSession.fireEndpointOnClose(WsSession.java:541) Local Variable: org.apache.catalina.loader.WebappClassLoader#2 Local Variable: org.apache.catalina.core.DefaultInstanceManager#3 at org.apache.tomcat.websocket.WsSession.doClose(WsSession.java:490) --> Acquires a “state lock” on Tomcats call to close the session at org.apache.tomcat.websocket.server.WsHttpUpgradeHandler.onError(WsHttpUpgradeHandler.java:150) Local Variable: org.apache.tomcat.websocket.server.WsHttpUpgradeHandler#6 at org.apache.tomcat.websocket.server.WsHttpUpgradeHandler.access$300(WsHttpUpgradeHandler.java:48) In the 2nd thread the Application is simply sending a normal message to Tomcat (unaware the session is closing simultaneously): "defaultEventExecutorGroup-3-1" prio=5 tid=94 BLOCKED at org.apache.tomcat.websocket.WsSession.registerFuture(WsSession.java:650) --> Waits on the Tomcat “stateLock” Local Variable: org.apache.tomcat.websocket.WsSession#6 at org.apache.tomcat.websocket.FutureToSendHandler.get(FutureToSendHandler.java:104) Local Variable: org.apache.tomcat.websocket.FutureToSendHandler#247 Local Variable: java.util.concurrent.TimeUnit$4#1 at org.apache.tomcat.websocket.FutureToSendHandler.get(FutureToSendHandler.java:31) at com.myc.common.domain.websocketutil.WsOutbound.await(WsOutbound.java:255) at com.myc.common.domain.websocketutil.WsOutbound.sendObj(WsOutbound.java:234) at com.myc.common.domain.websocketutil.WsOutbound.send(WsOutbound.java:215) --> Acquires the Application local lock while sending (to guarantee message order when sending multiple messages) Local Variable: com.myc.customer.model.json.notification.NewMessageNotification#3 Our interface to Tomcat (wsOutbound.java) takes the Application lock using class "synchronized" as follows: public synchronized void setSocketStatus(SocketStatus status) { // Does some state change stuff. Can be invoked from onClose() } public synchronized boolean send(Object obj) { // Sends one or more messages to Tomcat } So we had a simple locking strategy. Now I have to come up with a message queuing strategy so that I can send multiple messages, in order, without holding an Application lock. To my mind the root problem is in the existing Tomcat code for doClose(). In other deadlock situations I've been involved with the onus has been on the thread that invokes a callback/listener to exit any locks it holds first: private void doClose(CloseReason closeReasonMessage, CloseReason closeReasonLocal) { // Double-checked locking. OK because state is volatile if (state != State.OPEN) { return; } synchronized (stateLock) { if (state != State.OPEN) { return; } if (log.isDebugEnabled()) { log.debug(sm.getString("wsSession.doClose", id)); } try { wsRemoteEndpoint.setBatchingAllowed(false); } catch (IOException e) { log.warn(sm.getString("wsSession.flushFailOnClose"), e); <-- Exit stateLock first? fireEndpointOnError(e); } state = State.CLOSING; sendCloseMessage(closeReasonMessage); <-- Exit stateLock first? fireEndpointOnClose(closeReasonLocal); <-- Reenter stateLock? state = State.CLOSED; } I took a quick look at the latest 8.0.x (8.0.44) code and it appears the same issue can happen. I'd appreciate any comment on my analysis or improvements in coding our application. regards, Lorcan O'Toole