Bug 53513

Summary: Race condition / out of order operation in session replication at node startup
Product: Tomcat 6 Reporter: David Johle <djohle>
Component: ClusterAssignee: Tomcat Developers Mailing List <dev>
Severity: major CC: gbalogh
Priority: P2    
Version: 6.0.35   
Target Milestone: default   
Hardware: PC   
OS: Linux   

Description David Johle 2012-07-05 21:56:22 UTC
My configuration:

2 nodes running Tomcat 7.0.26
Using a custom session manager, which extends the DeltaManager

My startInternal() method first calls super.startInternal(), then performs a few additional initializations.

I reviewed the code of DeltaManager.startInternal(), and it calls getAllClusterSessions() which in turn calls waitForSendAllSessions(), which requires either getStateTransfered() to return true, or a timeout.

So by this, I should be able to trust that as the second node starts, the initial sync up of all session data from the first node has completed prior to the startInternal() method exiting (and thus prior to my initializations).

This is, however, not the case!  I can confirm this by repeatedly logging the value of findSessions().length during my inializations, and see that number going up!

There appears to be a race condition between the processing of the message containing the actual session data & the "transfer complete" message.  After tracing this through a little further, I see the stateTransfered is set to true in the handleALL_SESSION_TRANSFERCOMPLETE() callback method.  And that callback is being called PRIOR to the session data itself even being received!

Here is the debug logging output (slightly scrubbed) which shows this out of order messaging:

Jul 5, 2012 4:20:41 PM org.apache.catalina.ha.session.DeltaManager getAllClusterSessions
INFO: Manager [wwwtest#], requesting session state from org.apache.catalina.tribes.membership.MemberImpl[...].
This operation will timeout if no session state has been received within 60 seconds.

Jul 5, 2012 4:20:41 PM org.apache.catalina.ha.session.DeltaManager messageReceived
FINE: Manager [wwwtest#]: Received SessionMessage of type=(SESSION-STATE-TRANSFERED) from [org.apache.catalina.tribes.membership.MemberImpl[...]

Jul 5, 2012 4:20:41 PM org.apache.catalina.ha.session.DeltaManager handleALL_SESSION_TRANSFERCOMPLETE
FINE: Manager [wwwtest#] received from node [[B@6789b939:4,000] session state transfered.

Jul 5, 2012 4:20:41 PM org.apache.catalina.ha.session.DeltaManager messageReceived
FINE: Manager [wwwtest#]: Received SessionMessage of type=(ALL-SESSION-DATA) from [org.apache.catalina.tribes.membership.MemberImpl[...]

Jul 5, 2012 4:20:41 PM org.apache.catalina.ha.session.DeltaManager handleALL_SESSION_DATA
FINE: Manager [wwwtest#]: received session state data
Comment 1 David Johle 2012-07-05 22:25:34 UTC
In case it's helpful, here's the Cluster configuration...fairly basic stuff:

        <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster">
          <Manager  className="my.deltamanager.extension.CustomManager"

          <Channel className="org.apache.catalina.tribes.group.GroupChannel">
            <Membership className="org.apache.catalina.tribes.membership.McastService"
            <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"

            <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
              <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>

            <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>

            <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatch15Interceptor"/>


          <Valve className="org.apache.catalina.ha.tcp.ReplicationValve"
          <Valve className="org.apache.catalina.ha.session.JvmRouteBinderValve"/>

          <ClusterListener className="org.apache.catalina.ha.session.JvmRouteSessionIDBinderListener"/>
          <ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener"/>

Comment 2 Keiichi Fujino 2012-07-09 08:00:45 UTC
Thanks for the report.

I think there is a problem with the behavior of the DeltaManager.

As you know, DeltaManager is responsible for synchronizing the session on startup.
A node receiving the EVT_GET_ALL_SESSIONS message is to serialize all session, 
and then sends back a EVT_ALL_SESSION_DATA message.
After completing  EVT_ALL_SESSION_DATA message, sends a EVT_ALL_SESSION_TRANSFERCOMPLETE message.

At this time, if channelSendOptions is asynchronous(default),
EVT_ALL_SESSION_DATA message is sent asynchronously.

As a result, will be a race condition between the processing of the message containing 
the actual session data and the "transfer complete" message.

I'm going to fix this behavior.
I intend to make EVT_ALL_SESSION_DATA message always send in synchronous mode.

Anyway the current workaround is to set 6 (sync + ack) to channelSendOptions.

Best Regards.
Comment 3 Keiichi Fujino 2012-07-09 10:49:12 UTC
Fixed in trunk and 7.0.x and will be included in 7.0.30 onwards.
Proposed for 6.0.x.

In this fix, EVT_ALL_SESSION_DATA message is sent in synchronous mode. 
Therefore, it waits for completion of a the message till Sender#timeout (default 3000 milliseconds). 
When timeout occurs while sending the EVT_ALL_SESSION_DATA message, 
you can configure following attributes. 
DeltaManager#sendAllSessions DeltaManager#sendAllSessionsSize 
Comment 4 Mark Thomas 2012-07-11 18:22:12 UTC
Moving to Tomcat 6 since it has been fixed in 7.
Comment 5 Mark Thomas 2012-08-27 22:20:29 UTC
Fixed in 6.0.x and will be included in 6.0.36 onwards.