ASF Bugzilla – Attachment 35050 Details for
Bug 61183
Deadlock occurs while sending to a closing session
Home
|
New
|
Browse
|
Search
|
[?]
|
Reports
|
Help
|
New Account
|
Log In
Remember
[x]
|
Forgot Password
Login:
[x]
Description in txt
Tomcat_deadlock_description.txt (text/plain), 6.50 KB, created by
Lorcan O'Toole
on 2017-06-13 16:33:50 UTC
(
hide
)
Description:
Description in txt
Filename:
MIME Type:
Creator:
Lorcan O'Toole
Created:
2017-06-13 16:33:50 UTC
Size:
6.50 KB
patch
obsolete
>Hi, >We have recently ran into a deadlock situation in a customer site while using websockets. >One of the locks is in our code while the other is the Tomcat "stateLock" defined in "class WsSession". > >I have marked this issue as a regression because one of the 2 locations that Tomcat takes the stateLock in 8.0.33 did not exist in 8.0.29. Here's a quick look at the differences before examining the deadlocked call stacks: > >In 8.0.33 there's an expansion of the stateLock usage: > protected void registerFuture(FutureToSendHandler f2sh) { > boolean fail = false; > synchronized (stateLock) { > // leaving out the details... > } > } > >While in 8.0.29 the entire method looks like: > protected void registerFuture(FutureToSendHandler f2sh) { > futures.put(f2sh, f2sh); > } > >I can understand why the stateLock usage has been expanded. But I think there is a problem with this. Previously we'd only enter stateLock in Tomcat -> Application calls. Now with the new registerFuture we also can enter stateLock in Application -> Tomcat calls. >So a recent change seems to have introduced a key ingredient for deadlocks. If my application has its own lock there is the potential for 2 locks to be acquired in reverse order by 2 simultaneous threads. > >We have experienced a real-world example of this: > >In the 1st thread Tomcat is closing the session after an abrupt disconnect from the client. The close action bubbles up to the Application onClose handler. > >"http-nio-8445-exec-76" daemon prio=5 tid=249 BLOCKED > at com.myc.common.domain.websocketutil.WsOutbound.setSocketStatus(WsOutbound.java:152) --> Waits on the Application local lock > Local Variable: com.myc.common.domain.websocketutil.WsOutbound$SocketStatus#2 > at com.myc.customer.web.socket.WebSocketEndpoint.onClose(WebSocketEndpoint.java:224) --> Application onClose handler called which has to do some tidy up > Local Variable: javax.websocket.CloseReason#2 > Local Variable: com.myc.customer.web.socket.WebSocketEndpoint#4 > Local Variable: java.lang.Object#9184 > at sun.reflect.GeneratedMethodAccessor250.invoke(<unknown string>) > at sun.reflect.DelegatingMethodAccessorImpl.invoke(<unknown string>) > at java.lang.reflect.Method.invoke(<unknown string>) > at org.apache.tomcat.websocket.pojo.PojoEndpointBase.onClose(PojoEndpointBase.java:107) > Local Variable: org.apache.tomcat.websocket.pojo.PojoEndpointServer#6 > at org.apache.tomcat.websocket.WsSession.fireEndpointOnClose(WsSession.java:541) > Local Variable: org.apache.catalina.loader.WebappClassLoader#2 > Local Variable: org.apache.catalina.core.DefaultInstanceManager#3 > at org.apache.tomcat.websocket.WsSession.doClose(WsSession.java:490) --> Acquires a âstate lockâ on Tomcats call to close the session > at org.apache.tomcat.websocket.server.WsHttpUpgradeHandler.onError(WsHttpUpgradeHandler.java:150) > Local Variable: org.apache.tomcat.websocket.server.WsHttpUpgradeHandler#6 > at org.apache.tomcat.websocket.server.WsHttpUpgradeHandler.access$300(WsHttpUpgradeHandler.java:48) > > >In the 2nd thread the Application is simply sending a normal message to Tomcat (unaware the session is closing simultaneously): > >"defaultEventExecutorGroup-3-1" prio=5 tid=94 BLOCKED > at org.apache.tomcat.websocket.WsSession.registerFuture(WsSession.java:650) --> Waits on the Tomcat âstateLockâ > Local Variable: org.apache.tomcat.websocket.WsSession#6 > at org.apache.tomcat.websocket.FutureToSendHandler.get(FutureToSendHandler.java:104) > Local Variable: org.apache.tomcat.websocket.FutureToSendHandler#247 > Local Variable: java.util.concurrent.TimeUnit$4#1 > at org.apache.tomcat.websocket.FutureToSendHandler.get(FutureToSendHandler.java:31) > at com.myc.common.domain.websocketutil.WsOutbound.await(WsOutbound.java:255) > at com.myc.common.domain.websocketutil.WsOutbound.sendObj(WsOutbound.java:234) > at com.myc.common.domain.websocketutil.WsOutbound.send(WsOutbound.java:215) --> Acquires the Application local lock while sending (to guarantee message order when sending multiple messages) > Local Variable: com.myc.customer.model.json.notification.NewMessageNotification#3 > >Our interface to Tomcat (wsOutbound.java) takes the Application lock using class "synchronized" as follows: > > public synchronized void setSocketStatus(SocketStatus status) { > // Does some state change stuff. Can be invoked from onClose() > } > > public synchronized boolean send(Object obj) { > // Sends one or more messages to Tomcat > } > >So we had a simple locking strategy. Now I have to come up with a message queuing strategy so that I can send multiple messages, in order, without holding an Application lock. > >To my mind the root problem is in the existing Tomcat code for doClose(). In other deadlock situations I've been involved with the onus has been on the thread that invokes a callback/listener to exit any locks it holds first: > > private void doClose(CloseReason closeReasonMessage, > CloseReason closeReasonLocal) { > // Double-checked locking. OK because state is volatile > if (state != State.OPEN) { > return; > } > > synchronized (stateLock) { > if (state != State.OPEN) { > return; > } > > if (log.isDebugEnabled()) { > log.debug(sm.getString("wsSession.doClose", id)); > } > try { > wsRemoteEndpoint.setBatchingAllowed(false); > } catch (IOException e) { > log.warn(sm.getString("wsSession.flushFailOnClose"), e); > <-- Exit stateLock first? > fireEndpointOnError(e); > } > > state = State.CLOSING; > > sendCloseMessage(closeReasonMessage); > <-- Exit stateLock first? > fireEndpointOnClose(closeReasonLocal); > <-- Reenter stateLock? > state = State.CLOSED; > } > >I took a quick look at the latest 8.0.x (8.0.44) code and it appears the same issue can happen. >I'd appreciate any comment on my analysis or improvements in coding our application. > >regards, >Lorcan O'Toole
You cannot view the attachment while viewing its details because your browser does not support IFRAMEs.
View the attachment on a separate page
.
View Attachment As Raw
Actions:
View
Attachments on
bug 61183
: 35050