Bug 61740 - Intermittent NIO HTTP/2 errors
Summary: Intermittent NIO HTTP/2 errors
Alias: None
Product: Tomcat 9
Classification: Unclassified
Component: Connectors
Version: 9.0.1
Hardware: PC All
Importance: P2 normal
Target Milestone: -----
Assignee: Tomcat Developers Mailing List
Depends on:
Reported: 2017-11-08 23:13 UTC by David Frankson
Modified: 2017-11-16 13:39 UTC

screenshot of bug, code to reproduce (hopefully) (259.24 KB, application/x-zip-compressed)
2017-11-08 23:13 UTC, David Frankson

Description David Frankson 2017-11-08 23:13:55 UTC
Created attachment 35509
screenshot of bug, code to reproduce (hopefully)

I'm trying to troubleshoot an intermittent response error that occurs with Tomcat 8.5.23 or 9.0.1 when using HTTP/2. We noticed that when running over HTTP/2, random CSS, JS or HTML requests would fail, causing small glitches that went away on refresh. We were finally able to isolate it to a test case that "usually" reproduces the error.

The test case uses 100 iframes, each drawing 10 table cells, and each cell is colored green by a separate CSS file, so in total it makes 1101 requests (1 page + 100 iframes + 1000 CSS files). If any of those requests fails, the affected cell displays red. (See the attached image in the zip for the error in action.) I reproduced it using the latest version of Firefox with caching disabled so that it makes every request independently. It is very hard to reproduce in Chrome since it tends to ignore no-caching settings. I've also found it easier to reproduce with a powerful client machine running Windows 10. A less powerful client running Windows 7 had more difficulty reproducing the error but still could after enough tries.
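The attached zip is not reproduced in this report, so the following is only a minimal sketch of how a set of pages with the same shape could be generated; it is not the reporter's actual test files. Only the name newtest.html comes from the report. The class name, the frameN.html and cellN_M.css file names, and the "red by default, green once the stylesheet loads" markup are assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical generator: 1 main page + 100 iframe pages + 1000 stylesheets.
public class TestCaseSketch {
    static final int FRAMES = 100;          // iframes on the main page
    static final int CELLS_PER_FRAME = 10;  // table cells (one stylesheet each) per iframe

    // Requests for one uncached load: main page + iframe pages + stylesheets.
    static int expectedRequests() {
        return 1 + FRAMES + FRAMES * CELLS_PER_FRAME;
    }

    private static void write(Path file, String text) throws IOException {
        Files.write(file, text.getBytes(StandardCharsets.UTF_8));
    }

    // Writes all files under root and returns how many were written.
    static int generate(Path root) throws IOException {
        Files.createDirectories(root);
        int written = 0;
        StringBuilder main = new StringBuilder("<!DOCTYPE html>\n<html><body>\n");
        for (int f = 0; f < FRAMES; f++) {
            main.append("<iframe src=\"frame").append(f).append(".html\"></iframe>\n");
            // Cells are red unless their own stylesheet arrives and overrides the color.
            StringBuilder head = new StringBuilder("<style>td { background: red; }</style>\n");
            StringBuilder row = new StringBuilder();
            for (int c = 0; c < CELLS_PER_FRAME; c++) {
                String css = "cell" + f + "_" + c + ".css"; // assumed naming scheme
                head.append("<link rel=\"stylesheet\" href=\"").append(css).append("\">\n");
                row.append("<td id=\"c").append(c).append("\">&nbsp;</td>");
                write(root.resolve(css), "#c" + c + " { background: green; }\n");
                written++;
            }
            write(root.resolve("frame" + f + ".html"),
                    "<!DOCTYPE html>\n<html><head>\n" + head
                    + "</head><body><table><tr>" + row + "</tr></table></body></html>\n");
            written++;
        }
        main.append("</body></html>\n");
        write(root.resolve("newtest.html"), main.toString());
        return written + 1;
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : "testcase-out");
        int files = generate(root);
        System.out.println(files + " files written; " + expectedRequests()
                + " requests expected per uncached load");
    }
}
```

Running it with the path of the ROOT webapp as the only argument would place the generated files next to the stock content. Because each cell depends on its own stylesheet, a single failed response shows up as one red cell, which is what makes dropped requests easy to spot.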

With Tomcat running on Windows x64, using a fresh download of either 9.0.1 or 8.5.23 with the stock configuration, I enable HTTP/2 with:

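    <Connector port="8443" protocol="org.apache.coyote.http11.Http11NioProtocol"
               maxThreads="150" SSLEnabled="true" >
        <UpgradeProtocol className="org.apache.coyote.http2.Http2Protocol" />
        <SSLHostConfig>
            <!-- keystore path assumed from the stock server.xml example -->
            <Certificate certificateKeystoreFile="conf/localhost-rsa.jks" type="RSA" />
        </SSLHostConfig>
    </Connector>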

Then I put the test files in the ROOT app and hit https://localhost:8443/newtest.html until the error happens.

As you can see in the image, some of the responses have 0 bytes and display in red, some have response bodies but no HTTP status code, and some have HTTP 200 but no response body. When no HTTP status is returned, the access log records the request as a 500 error. I can't find any meaningful exception with Catalina debug logging turned on.
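Since the failed requests surface as 500 entries in the access log, counting them after each page load is one way to tell whether a run reproduced the problem. The following is a rough sketch that assumes the access log pattern from the stock server.xml (%h %l %u %t "%r" %s %b), where the status code follows the quoted request line. The class and method names are made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: counts access log entries with a given HTTP status.
public class AccessLogStatusCount {
    // Assumes the line ends with: "request line" status bytes
    private static final Pattern ENTRY = Pattern.compile("\"[^\"]*\" (\\d{3}) \\S+$");

    static long countStatus(List<String> lines, int status) {
        long count = 0;
        for (String line : lines) {
            Matcher m = ENTRY.matcher(line);
            if (m.find() && Integer.parseInt(m.group(1)) == status) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to an access log file, e.g. one under the logs directory
        List<String> lines = Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);
        System.out.println("500 responses: " + countStatus(lines, 500));
    }
}
```

A nonzero count after a load of newtest.html would line up with the red cells seen in the browser. Lines that do not match the assumed pattern are simply skipped.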
Comment 1 Remy Maucherat 2017-11-09 08:29:45 UTC
According to my testing, this demonstrates reliability issues, and possible fixes, pretty much everywhere, but more so with NIO2. Lots of things to work on and debug, thanks for keeping us busy ;)
Comment 2 Remy Maucherat 2017-11-09 14:51:47 UTC
I fixed the NIO2-specific issue (it will be in 9.0.2), pending possible further improvements. The behavior is now the same as with NIO: I can reproduce that a very small number of the static requests fail, and I don't see where the root cause could be at the moment.
Comment 3 Mark Thomas 2017-11-16 11:45:22 UTC
A huge thank you for the test case. This bug has all the hallmarks of one whose root cause is very tricky to track down. Having a reliable test case is an enormous help.

I'm able to reproduce the problem and, with debug logging for HTTP/2 enabled, I can see an exception relating to HPACK decoding. I'm looking into this now.
Comment 4 Mark Thomas 2017-11-16 12:25:10 UTC
Fixed in:
- trunk for 9.0.2 onwards
- 8.5.x for 8.5.24 onwards

Again, many, many thanks for the test case.
Comment 5 David Frankson 2017-11-16 13:39:50 UTC
Thanks for the speedy fixes!