Bug 59289 - Upgrade processor is put in recycledProcessors stack of Http11NioProtocol.Http11ConnectionHandler
Summary: Upgrade processor is put in recycledProcessors stack of Http11NioProtocol.Htt...
Status: RESOLVED FIXED
Alias: None
Product: Tomcat 8
Classification: Unclassified
Component: Connectors (show other bugs)
Version: 8.0.33
Hardware: PC Linux
: P2 normal (vote)
Target Milestone: ----
Assignee: Tomcat Developers Mailing List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-04-08 14:23 UTC by Alexander Malyshev
Modified: 2016-04-14 15:26 UTC (History)
0 users



Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Alexander Malyshev 2016-04-08 14:23:10 UTC
We're using Tomcat in our product and facing issues with a few HTTP requests under traffic. Basically what happens is that incoming HTTP connection is terminated by Tomcat (TCP FIN or TCP RST packet is sent) right after HTTPS handshake is finished.

Each traffic iteration consists of the following:

1) 2000 Websocket connections used to pass data between the clients.
2) Some minor supporting HTTP traffic.

In the end of traffic iteration Websocket sessions are closing and, at the same time, some supporting HTTP traffic is processed by Tomcat. I was adding various logs to the Tomcat code and figured out that Tomcat attempts to process these failing HTTP requests with Upgrade processors (org.apache.coyote.http11.upgrade.NioProcessor).

I added the following logs:
1) AbstractProtocol.AbstractConnectionHandler.process right after processor is identified:

                if (processor.isUpgrade()) {
                   getLog().info("AbstractConnectionHandler.process got processor from " + processorSource + 
                         ". status = " + status.name() + 
                         ", processor.id = " + processor.getId());
                }

2) Http11NioProtocol.Http11ConnectionHandler.release (both, right before processor is recycled)

               if (processor.getId()  == -2) {
                  getLog().info("Recycling upgrade processor in HTTP protocol.\n", new Exception());
               }

note that I adde method getId to all processors. For regular HTTP processors it equals to request ID (which I also added). For upgrade processor (org.apache.coyote.http11.upgrade.NioProcessor) ID always equal to -2.

and here is the output I got right before failed HTTP request:

Apr 08, 2016 CDT 08:26:09.686 AM http-nio-135.60.87.90-443-ClientPoller-1 org.apache.coyote.http11.Http11NioProtocol
INFO: Recycling upgrade processor in HTTP protocol.


java.lang.Exception
	at org.apache.coyote.http11.Http11NioProtocol$Http11ConnectionHandler.release(Http11NioProtocol.java:200)
	at org.apache.tomcat.util.net.NioEndpoint$Poller.cancelledKey(NioEndpoint.java:953)
	at org.apache.tomcat.util.net.NioEndpoint$Poller.processKey(NioEndpoint.java:1135)
	at org.apache.tomcat.util.net.NioEndpoint$Poller.run(NioEndpoint.java:1071)
	at java.lang.Thread.run(Thread.java:745)

....

Apr 08, 2016 CDT 08:26:09.710 AM http-nio-135.60.87.90-443-exec-106 org.apache.coyote.http11.Http11NioProtocol
INFO: AbstractConnectionHandler.process got processor from recycledProcessors map. status = OPEN_READ, processor.id = -2

So, what happens is - Poller adds NioProcessor to recycledProcessors stack of Http11NioProtocol.Http11ConnectionHandler and next HTTP request to Tomcat which is coming in is failing because Tomcat attempts to process it with NioProcessor instead of regular Http11NioProcessor.

If any additional logs are required to reproduce the issue  - please let me know. It seems like during normal operation such NioProcessors are just removed from commections map, for example here in AbstractProtocol.AbstractConnectionHandler.process method:
                    // Connection closed. OK to recycle the processor. Upgrade
                    // processors are not recycled.
                    connections.remove(socket);
                    if (processor.isUpgrade()) {
...

and are never expected to be recycled. And in case CancelledKeyException is thrown in Poller there's no check for upgrade processors and upgrade processor can be recycled.
Comment 1 Remy Maucherat 2016-04-08 15:59:02 UTC
This need to be verified, but I do notice some upgraded processors going into this code path when running the websockets portion of the testsuite. I added an extra check based on the upgraded flag.
Comment 2 Alexander Malyshev 2016-04-08 18:18:29 UTC
Does that mean that you have a fix for this issue?
Looking at the code it's not clean if it's enough to just skip recycling such Upgrade connectors or it's also needed to execute the code similar to the one executed in normal scenario:

                        UpgradeToken upgradeToken = processor.getUpgradeToken();
                        HttpUpgradeHandler httpUpgradeHandler = upgradeToken.getHttpUpgradeHandler();
                        InstanceManager instanceManager = upgradeToken.getInstanceManager();
                        if (instanceManager == null) {
                            httpUpgradeHandler.destroy();
                        } else {
                            ClassLoader oldCL = upgradeToken.getContextBind().bind(false, null);
                            try {
                                httpUpgradeHandler.destroy();
                                instanceManager.destroyInstance(httpUpgradeHandler);
                            } finally {
                                upgradeToken.getContextBind().unbind(false, oldCL);
                            }
                        }
Comment 3 Alexander Malyshev 2016-04-08 18:19:13 UTC
If you have a patch I can apply it, build tomcat and test the fix.
Comment 4 mgrigorov 2016-04-08 19:17:34 UTC
Please check http://svn.apache.org/viewvc?view=revision&revision=1738261
Comment 5 Alexander Malyshev 2016-04-11 12:33:08 UTC
The change you mentioned is for trunk. I'm using 8.0.33, and it's quite different. For example it does not have implemented release() method in AbstractProtocol class. I applied similar change to Http11NioProtocol class and will retest. But if you can provide actual diff for 8.0 (which will be eventually released) - it would be much better as I'll be able to make sure this change is actually fixing an issue.
Comment 6 Remy Maucherat 2016-04-11 16:13:42 UTC
Even 7 had the issue when adding a debug in the right release method and using the websocket portion of the testsuite. Oops.
Comment 7 Mark Thomas 2016-04-11 20:53:15 UTC
Remy's patches should have fixed this.
Comment 8 Remy Maucherat 2016-04-11 23:48:48 UTC
The fix will be in 9M5, 8.5.1, 8.0.34 and 7.0.70.
Comment 9 Alexander Malyshev 2016-04-14 15:26:40 UTC
Applied the patch and tested - issue is fixed.