Created attachment 27372 [details]
Fix (patch), problem description, test case source code.

== Summary:

Session replication fails with ClassNotFoundException when a session attribute is a Java dynamic proxy (java.lang.reflect.Proxy).

== Description:

In my application I'm storing a serializable object in an HTTP session attribute. One of the object's references is a (serializable) Java dynamic proxy (java.lang.reflect.Proxy). The object is properly serializable (including the proxy), but the HTTP session fails to migrate between cluster nodes: the receiving node throws ClassNotFoundException on session deserialization.

This bug is also reproducible when the session attribute is the dynamic proxy itself, not only when the proxy is part of a serialized object graph.

== Cause

For session migration Tomcat uses a specialized ObjectInputStream implementation: org.apache.catalina.cluster.session.ReplicationStream. This class properly overrides the #resolveClass(ObjectStreamClass) method to resolve deserialized classes using the web application class loader. However, deserializing a dynamic proxy calls the #resolveProxyClass(String[]) method, not #resolveClass(ObjectStreamClass). The default implementation of #resolveProxyClass(String[]) in ObjectInputStream can't see the web application class loader that was used to load the interfaces implemented by the dynamic proxy, so deserialization fails with ClassNotFoundException.

== Solution

ReplicationStream#resolveProxyClass(String[]) should be overridden in the same way that #resolveClass(ObjectStreamClass) was. This problem was already fixed in Tomcat 6 and Tomcat 7. The attached patch is a modified version of the solution taken from Tomcat 6.0.32.

== Test case

(See the attached sample web application for source code, configurations and build scripts.)

I've got a session counter (Counter interface) with two implementations. One is a direct implementation (CounterImpl); the other is a dynamic proxy that uses an invocation handler (CounterInvocationHandler) that delegates all proxy methods to an instance of the direct implementation.
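For readers without the attachment, here is a minimal sketch of the kind of classes described above. The names Counter, CounterImpl and CounterInvocationHandler come from the test case; the increment() method, the CounterProxyFactory helper and the omission of the "test" package are assumptions made for illustration, so the attached sources may well look different.

```java
import java.io.Serializable;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Hypothetical method set; the real Counter interface may declare other methods.
interface Counter {
    int increment();
}

// Direct implementation: a plain serializable counter.
class CounterImpl implements Counter, Serializable {
    private static final long serialVersionUID = 1L;
    private int value;

    public int increment() {
        return ++value;
    }
}

// Serializable handler that forwards every proxy call to the direct implementation.
class CounterInvocationHandler implements InvocationHandler, Serializable {
    private static final long serialVersionUID = 1L;
    private final Counter delegate;

    CounterInvocationHandler(Counter delegate) {
        this.delegate = delegate;
    }

    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        try {
            return method.invoke(delegate, args);
        } catch (InvocationTargetException e) {
            // Rethrow the delegate's own exception instead of the reflection wrapper.
            throw e.getCause();
        }
    }
}

// Hypothetical helper (not part of the test case) that builds the proxied counter.
class CounterProxyFactory {
    static Counter newProxiedCounter() {
        return (Counter) Proxy.newProxyInstance(
                Counter.class.getClassLoader(),
                new Class<?>[] { Counter.class },
                new CounterInvocationHandler(new CounterImpl()));
    }
}
```

A proxy built this way is serializable because java.lang.reflect.Proxy implements Serializable and the handler is serializable too. Its serialized form records the names of the proxy's interfaces rather than a concrete class, which is why deserialization goes through #resolveProxyClass(String[]).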
There are two servlets sharing a single implementation (CounterServlet), mapped to two different URIs. The first (mapped to /counter/) creates a session that uses the direct implementation; the second (mapped to /counter-with-proxy/) uses the dynamic proxy implementation.

When the application is run in the cluster, restarting all servers and accessing the application via the /counter/ URL makes the session migrate properly. When the application is accessed via /counter-with-proxy/, ClassNotFoundException is thrown on the receiving node:

2011-08-11 09:32:18 org.apache.catalina.cluster.session.SimpleTcpReplicationManager readSession
SEVERE: Failed to deserialize the session!
java.lang.ClassNotFoundException: test.Counter
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:247)
    at java.io.ObjectInputStream.resolveProxyClass(ObjectInputStream.java:675)
    at java.io.ObjectInputStream.readProxyDesc(ObjectInputStream.java:1530)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1492)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1731)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)
    at org.apache.catalina.session.StandardSession.readObject(StandardSession.java:1459)
    at org.apache.catalina.session.StandardSession.readObjectData(StandardSession.java:983)
    at org.apache.catalina.cluster.session.ReplicatedSession.readObjectData(ReplicatedSession.java:172)
    at org.apache.catalina.cluster.session.SimpleTcpReplicationManager.readSession(SimpleTcpReplicationManager.java:399)
    at org.apache.catalina.cluster.session.SimpleTcpReplicationManager.messageReceived(SimpleTcpReplicationManager.java:583)
    at org.apache.catalina.cluster.session.SimpleTcpReplicationManager.messageDataReceived(SimpleTcpReplicationManager.java:622)
    at org.apache.catalina.cluster.session.ClusterSessionListener.messageReceived(ClusterSessionListener.java:86)
    at org.apache.catalina.cluster.tcp.SimpleTcpCluster.receive(SimpleTcpCluster.java:1175)
    at org.apache.catalina.cluster.tcp.ClusterReceiverBase.messageDataReceived(ClusterReceiverBase.java:598)
    at org.apache.catalina.cluster.io.ObjectReader.execute(ObjectReader.java:108)
    at org.apache.catalina.cluster.tcp.TcpReplicationThread.drainChannel(TcpReplicationThread.java:139)
    at org.apache.catalina.cluster.tcp.TcpReplicationThread.run(TcpReplicationThread.java:70)

== Steps to reproduce

==== Environment:

Linux Ubuntu:

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 10.04.3 LTS
Release: 10.04
Codename: lucid

Java (Sun JDK 1.6):

$ /usr/lib/jvm/java-6-sun/bin/java -version
java version "1.6.0_26"
Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
Java HotSpot(TM) Server VM (build 20.1-b02, mixed mode)

==== Cluster setup

1. Choose a cluster virtual host name and set it up in DNS or the /etc/hosts file.

2. Change server names appropriately in the provided configuration files:
   - n1/conf/server.xml (cluster node 1 config):
     - <Engine> defaultHost
     - <Host> name
     - <Cluster> clusterName
   - n2/conf/server.xml (cluster node 2 config):
     - <Engine> defaultHost
     - <Host> name
     - <Cluster> clusterName
   - tomcat-cluster.tskutnik.lan.e-point.pl (Apache2 site definition):
     - ErrorLog location for <VirtualHost: *:80> and <VirtualHost: *:443>
     - CustomLog location for <VirtualHost: *:80> and <VirtualHost: *:443>
   - workers.properties
     - worker.servlet-cluster-n1.host
     - worker.servlet-cluster-n2.host

3. Install the modified tomcat-cluster.tskutnik.lan.e-point.pl file as an Apache2 site definition (e.g.
by putting it in /etc/apache2/sites-enabled/).

4. Install the modified workers.properties as the mod_jk configuration for Apache2 (e.g. by putting it in /etc/libapache2-mod-jk/).

5. Create a two-node cluster. The cluster configuration is provided in the n1/conf/server.xml and n2/conf/server.xml files.

5.1. Unpack the Tomcat 5.5.33 archive into two identical, co-located directories:

$ mkdir -p ~/work/tomcat-clusters/tomcat-testcase/
$ cd ~/work/tomcat-clusters/tomcat-testcase/
$ tar zxf ~/Downloads/apache-tomcat-5.5.33.tar.gz
$ cp -r apache-tomcat-5.5.33 n1
$ mv apache-tomcat-5.5.33 n2

5.2. Install the modified configuration files:

$ cp n1/conf/server.xml ~/work/tomcat-clusters/tomcat-testcase/n1/conf
$ cp n2/conf/server.xml ~/work/tomcat-clusters/tomcat-testcase/n2/conf

6. Build the sample application:

$ ./war.sh

This should build a WAR archive placed in the file /tmp/et.war. The cluster nodes use this file to load the test case application. If you want to put this file somewhere else, modify your server.xml files accordingly.

7. Restart Apache2 and verify that the virtual host properly forwards HTTP requests to the Tomcat cluster.

==== Reproducing bug

1. Start both cluster nodes:

$ cd ~/work/tomcat-clusters/tomcat-testcase/n1
$ ./bin/startup.sh
$ cd ~/work/tomcat-clusters/tomcat-testcase/n2
$ ./bin/startup.sh

2. Access the application URL via a browser, e.g.:

http://tomcat-cluster.tskutnik.lan.e-point.pl/counter-with-proxy/

Use the virtual host name that you chose in step 1 of the cluster setup. The browser should display a page similar to:

--------------------------------------------------
Requested Session Id: DD0C9C7681453FD12BDFD7F5380D10BC.servlet-cluster-n1
Session Id: C2C45A80C69D26D6C1CA5E2800F86306.servlet-cluster-n1
Counter: test.CounterImpl@d16fc1; Counter: 2 on server: n1
Using proxy: true
Counter serialized+deserialized: test.CounterImpl@94257f
Server name: n1
--------------------------------------------------

The counter increases by 1 on every request. "Using proxy" indicates whether the counter implementation is a dynamic proxy.
"Counter serialized+deserialized" demonstrates that the counter is serializable. "Server name" (n1 above) identifies the main node to which HTTP requests are directed by the Apache2 web server instance. If the main node is n1, then the receiver node is n2.

3. Examine the catalina.out log on the receiver node. You should see a ClassNotFoundException, e.g.:

--------------------------------------------------
2011-08-11 09:32:18 org.apache.catalina.cluster.session.SimpleTcpReplicationManager readSession
SEVERE: Failed to deserialize the session!
java.lang.ClassNotFoundException: test.Counter
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:247)
    at java.io.ObjectInputStream.resolveProxyClass(ObjectInputStream.java:675)
    at java.io.ObjectInputStream.readProxyDesc(ObjectInputStream.java:1530)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1492)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1731)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)
    at org.apache.catalina.session.StandardSession.readObject(StandardSession.java:1459)
    at org.apache.catalina.session.StandardSession.readObjectData(StandardSession.java:983)
    at org.apache.catalina.cluster.session.ReplicatedSession.readObjectData(ReplicatedSession.java:172)
    at org.apache.catalina.cluster.session.SimpleTcpReplicationManager.readSession(SimpleTcpReplicationManager.java:399)
    at org.apache.catalina.cluster.session.SimpleTcpReplicationManager.messageReceived(SimpleTcpReplicationManager.java:583)
    at org.apache.catalina.cluster.session.SimpleTcpReplicationManager.messageDataReceived(SimpleTcpReplicationManager.java:622)
    at org.apache.catalina.cluster.session.ClusterSessionListener.messageReceived(ClusterSessionListener.java:86)
    at org.apache.catalina.cluster.tcp.SimpleTcpCluster.receive(SimpleTcpCluster.java:1175)
    at org.apache.catalina.cluster.tcp.ClusterReceiverBase.messageDataReceived(ClusterReceiverBase.java:598)
    at org.apache.catalina.cluster.io.ObjectReader.execute(ObjectReader.java:108)
    at org.apache.catalina.cluster.tcp.TcpReplicationThread.drainChannel(TcpReplicationThread.java:139)
    at org.apache.catalina.cluster.tcp.TcpReplicationThread.run(TcpReplicationThread.java:70)
--------------------------------------------------

Killing the main node with "kill -9" demonstrates that the session DID NOT migrate properly to node2 and the counter state WAS LOST.

4. You may restart both cluster nodes and try accessing the application using the /counter/ URI, which does not exhibit this behavior. Killing the main node with "kill -9" demonstrates that the session DID migrate properly to node2 and the counter state WAS NOT LOST.

== Verifying fix

I did not compile Tomcat from scratch to verify my solution. I only recompiled a single class (ReplicationStream), packaged it in a JAR archive and "patched" Tomcat by putting this JAR in the common/lib directory:

$ mkdir bin
$ javac -classpath </path/to/binary>/server/lib/catalina.jar ./apache-tomcat-5.5.33-src-fix/container/modules/cluster/src/share/org/apache/catalina/cluster/session/ReplicationStream.java -d bin
$ jar cf catalina-proxy-serialization-fix.jar -C bin .
$ cp catalina-proxy-serialization-fix.jar </path/to/binary>/common/lib

These steps fixed the bug.
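To illustrate the idea from the Solution section without opening the attachment, here is a rough sketch of an ObjectInputStream that resolves both ordinary classes and proxy interfaces against a supplied class loader. LoaderAwareObjectInputStream is a hypothetical name; this is neither ReplicationStream nor the attached patch. It is also simplified: it assumes all proxy interfaces are public and visible to the one loader, which a complete implementation would not take for granted.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectStreamClass;
import java.lang.reflect.Proxy;

// Hypothetical class for illustration only; not Tomcat code.
class LoaderAwareObjectInputStream extends ObjectInputStream {
    private final ClassLoader loader;

    LoaderAwareObjectInputStream(InputStream in, ClassLoader loader) throws IOException {
        super(in);
        this.loader = loader;
    }

    // Ordinary classes: try the supplied loader first, then the default lookup
    // (which also handles primitive type descriptors).
    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
            throws IOException, ClassNotFoundException {
        try {
            return Class.forName(desc.getName(), false, loader);
        } catch (ClassNotFoundException e) {
            return super.resolveClass(desc);
        }
    }

    // Dynamic proxies: load every interface through the supplied loader and
    // build the proxy class in that loader as well.
    @Override
    protected Class<?> resolveProxyClass(String[] interfaces)
            throws IOException, ClassNotFoundException {
        Class<?>[] resolved = new Class<?>[interfaces.length];
        for (int i = 0; i < interfaces.length; i++) {
            resolved[i] = Class.forName(interfaces[i], false, loader);
        }
        try {
            // Deprecated since Java 9 but still available; assumes public interfaces.
            return Proxy.getProxyClass(loader, resolved);
        } catch (IllegalArgumentException e) {
            throw new ClassNotFoundException(null, e);
        }
    }
}
```

In a setup like the one in this report, the loader passed to the constructor would be the web application class loader, so that an interface such as test.Counter can be found when the proxy is deserialized on the receiving node.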
Created attachment 27375 [details]
Expanded patch

This expanded patch covers both cluster implementations in 5.5.x and slightly simplifies the proposed patch.
Patch has been applied to 5.5.x and will be included in 5.5.34 onwards.