We're using Tomcat in our product and facing issues with a few HTTP requests under traffic. Basically what happens is that incoming HTTP connection is terminated by Tomcat (TCP FIN or TCP RST packet is sent) right after HTTPS handshake is finished. Each traffic iteration consists of the following: 1) 2000 Websocket connections used to pass data between the clients. 2) Some minor supporting HTTP traffic. In the end of traffic iteration Websocket sessions are closing and, at the same time, some supporting HTTP traffic is processed by Tomcat. I was adding various logs to the Tomcat code and figured out that Tomcat attempts to process these failing HTTP requests with Upgrade processors (org.apache.coyote.http11.upgrade.NioProcessor). I added the following logs: 1) AbstractProtocol.AbstractConnectionHandler.process right after processor is identified: if (processor.isUpgrade()) { getLog().info("AbstractConnectionHandler.process got processor from " + processorSource + ". status = " + status.name() + ", processor.id = " + processor.getId()); } 2) Http11NioProtocol.Http11ConnectionHandler.release (both, right before processor is recycled) if (processor.getId() == -2) { getLog().info("Recycling upgrade processor in HTTP protocol.\n", new Exception()); } note that I adde method getId to all processors. For regular HTTP processors it equals to request ID (which I also added). For upgrade processor (org.apache.coyote.http11.upgrade.NioProcessor) ID always equal to -2. and here is the output I got right before failed HTTP request: Apr 08, 2016 CDT 08:26:09.686 AM http-nio-135.60.87.90-443-ClientPoller-1 org.apache.coyote.http11.Http11NioProtocol INFO: Recycling upgrade processor in HTTP protocol. java.lang.Exception at org.apache.coyote.http11.Http11NioProtocol$Http11ConnectionHandler.release(Http11NioProtocol.java:200) at org.apache.tomcat.util.net.NioEndpoint$Poller.cancelledKey(NioEndpoint.java:953) at org.apache.tomcat.util.net.NioEndpoint$Poller.processKey(NioEndpoint.java:1135) at org.apache.tomcat.util.net.NioEndpoint$Poller.run(NioEndpoint.java:1071) at java.lang.Thread.run(Thread.java:745) .... Apr 08, 2016 CDT 08:26:09.710 AM http-nio-135.60.87.90-443-exec-106 org.apache.coyote.http11.Http11NioProtocol INFO: AbstractConnectionHandler.process got processor from recycledProcessors map. status = OPEN_READ, processor.id = -2 So, what happens is - Poller adds NioProcessor to recycledProcessors stack of Http11NioProtocol.Http11ConnectionHandler and next HTTP request to Tomcat which is coming in is failing because Tomcat attempts to process it with NioProcessor instead of regular Http11NioProcessor. If any additional logs are required to reproduce the issue - please let me know. It seems like during normal operation such NioProcessors are just removed from commections map, for example here in AbstractProtocol.AbstractConnectionHandler.process method: // Connection closed. OK to recycle the processor. Upgrade // processors are not recycled. connections.remove(socket); if (processor.isUpgrade()) { ... and are never expected to be recycled. And in case CancelledKeyException is thrown in Poller there's no check for upgrade processors and upgrade processor can be recycled.
This need to be verified, but I do notice some upgraded processors going into this code path when running the websockets portion of the testsuite. I added an extra check based on the upgraded flag.
Does that mean that you have a fix for this issue? Looking at the code it's not clean if it's enough to just skip recycling such Upgrade connectors or it's also needed to execute the code similar to the one executed in normal scenario: UpgradeToken upgradeToken = processor.getUpgradeToken(); HttpUpgradeHandler httpUpgradeHandler = upgradeToken.getHttpUpgradeHandler(); InstanceManager instanceManager = upgradeToken.getInstanceManager(); if (instanceManager == null) { httpUpgradeHandler.destroy(); } else { ClassLoader oldCL = upgradeToken.getContextBind().bind(false, null); try { httpUpgradeHandler.destroy(); instanceManager.destroyInstance(httpUpgradeHandler); } finally { upgradeToken.getContextBind().unbind(false, oldCL); } }
If you have a patch I can apply it, build tomcat and test the fix.
Please check http://svn.apache.org/viewvc?view=revision&revision=1738261
The change you mentioned is for trunk. I'm using 8.0.33, and it's quite different. For example it does not have implemented release() method in AbstractProtocol class. I applied similar change to Http11NioProtocol class and will retest. But if you can provide actual diff for 8.0 (which will be eventually released) - it would be much better as I'll be able to make sure this change is actually fixing an issue.
Even 7 had the issue when adding a debug in the right release method and using the websocket portion of the testsuite. Oops.
Remy's patches should have fixed this.
The fix will be in 9M5, 8.5.1, 8.0.34 and 7.0.70.
Applied the patch and tested - issue is fixed.