Bug 62177 - Protocol errors when using HTTP2 SSL Servlet 4.0 ServerPush PushBuilder
Summary: Protocol errors when using HTTP2 SSL Servlet 4.0 ServerPush PushBuilder
Status: RESOLVED FIXED
Alias: None
Product: Tomcat 9
Classification: Unclassified
Component: Servlet (show other bugs)
Version: 9.0.6
Hardware: PC All
: P2 normal (vote)
Target Milestone: -----
Assignee: Tomcat Developers Mailing List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-03-13 15:45 UTC by Holger Sunke
Modified: 2018-03-15 15:23 UTC (History)
0 users



Attachments
server.xml, catalina.log, chrome detailed protocol event output (59.23 KB, application/x-zip-compressed)
2018-03-13 15:45 UTC, Holger Sunke
Details
Log when invoking /showcase-6.3-SNAPSHOT with FireFox (86.12 KB, text/plain)
2018-03-14 07:26 UTC, Holger Sunke
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Holger Sunke 2018-03-13 15:45:01 UTC
Created attachment 35760 [details]
server.xml, catalina.log, chrome detailed protocol event output

hello,

web pages using ServerPush feature with JSF Mojarra 2.3.2 and many to be pushed nested resources fail to load due to SSL protocol violations.

(nested resource = pushed .css file pointing to a background image or icon etc. which is to be pushed too).

Example application: PrimeFaces Showcase 6.3-SNAPSHOT:


git clone https://github.com/primefaces/showcase.git
cd showcase
mvn clean package -Pcommunity-stable

-> deploy the target\showcase-6.3-SNAPSHOT.war

Use attached server.xml with your customized keystore.
Use jdk9.
Use Chrome "chrome://net-internals/#events" URL to inspect protocol events.

Invoke the showcase:
http://localhost:8080/showcase-6.3-SNAPSHOT/ -> SUCCESS (HTTP1.1)
https://localhost:8443/showcase-6.3-SNAPSHOT/ -> ERR_SPDY_PROTOCOL_ERROR
https://localhost:8444/showcase-6.3-SNAPSHOT/ -> ERR_SPDY_PROTOCOL_ERROR
https://localhost:8445/showcase-6.3-SNAPSHOT/ -> ERR_SPDY_PROTOCOL_ERROR
https://localhost:8446/showcase-6.3-SNAPSHOT/ -> ERR_SPDY_PROTOCOL_ERROR
https://localhost:8447/showcase-6.3-SNAPSHOT/ -> SUCCESS (HTTP1.1)


I guess there's some race condition. When setting up a "smaller" reproducer with less (nested) resources, browsing that app randomly succeeds.

Kind regards.
Comment 1 Remy Maucherat 2018-03-13 16:42:49 UTC
Ok, so it is the followup to 62053 I guess. If I go there ( https://github.com/javaserverfaces/mojarra/issues/4329 ), use the test app with Firefox, it pushes fine and doesn't break. Is chrome and the showcase mandatory to reproduce ?
Comment 2 Holger Sunke 2018-03-14 07:26:24 UTC
Created attachment 35761 [details]
Log when invoking /showcase-6.3-SNAPSHOT with FireFox

When invoking /showcase-6.3-SNAPSHOT with FireFox, same issues occure. I have chosen Chrome because it has more protocol inspection capabilities.

BTW: please turn off the Cache usage in [F12] development tools.
Comment 3 Remy Maucherat 2018-03-14 15:42:45 UTC
Ok, tested. What I really don't buy at this point is that all three connectors (APR, NIO, NIO2) do the exact same thing if it is a "race condition" of some sort.
Comment 4 Holger Sunke 2018-03-14 19:06:25 UTC
I think itmight be connector independent.
Somwhere it must be decided if a new HTTP2-Stream is created with incremented ID.
When looking at the net-events.html output from Chrome, it seems like this decision goes wrong (sometimes?) for a nested resource where the referrer is a pushed resource itself.

a) showcase.css PUSH PROMISE received in stream ID 1, promised for future stream 2.
   referrer: selector.xhtml
b) perfect-scrollbar.css PUSH PROMISE received in stream ID 1, promised for future stream 4.
   referrer: selector.xhtml
c) lato-black-webfont.eot font file PUSH PROMISE received in stream ID 2, promised for future stream 6.
   referrer: (!)showcase.css(!)

It seemes very important that PUSH PROMISEs are sent in odd IDed stream in response to browser request stream with even ID.

In (c) Chrome detects that a PUSH PROMISE is incoming on Stream ID 2 which is an even ID while an odd one is expected. Chrome than runs around screaming "FIRE! FIRE!" and closes the window (aka connection) which triggers subsequent write attempts on closed streams server side with plenty exceptions logged (one exception per ressource pushing Thread).


So it also seemes that PUSH PROMISEs with subsequent Ressource push transmission are treated as "fake" inbound HTTP request in a separate http connector Thread.

Let this be the algorithm:
I) Incoming (fake or real) HTTP/2 request with stream ID x
II) Open new Stream on current SSL connection with ID=x+1 to transmit PUSH PROMISEs.

This is also what happens when showcase.css promised in (a) starts to get transmitted:
The Thread recognizes there are additional resources referenced in showcase.css to be pushed, and does (II) while current Stream ID is still x=1 (if the thread is fast enough and no other thread does incremets before it).
So the ID gets incremeted to 2 and subsequent PUSH PROMISEs get transmitted in illegal even stream id 2.

BTW: Firefox seems to be mor tolerant against this and does request retries so that the Page displays correctly at some point in time, still leaving several Exceptions in server log.
Chrome instantly enters panic mode and disconnects.

This is all guesses only without viewing the source code. I'm not sure what the correct fix is here.
Maybe we have to give step (II) awareness about if this is currently a "fake" request and/or that PUSH PROMISES must be sent in stream 1 or without ID incremet.

Are new Streams created in Connector classes or is it Servlet 4 specific code?
And a bit off topic: Does ServerPush also work with AJP when frontend HTTP server uses SSL over HTTP2?

Kind regards.
Comment 5 Mark Thomas 2018-03-15 11:34:13 UTC
Yes, there is a bug here.

I'm not going to go into the details of how PUSH_PROMISE works - see RFC7540 for that.

RFC 7540 requires that PUSH_PROMISE frames are only sent on peer initiated streams in the OPEN or HALF-CLOSED state. Tomcat is not following this requirement. I'll look at a patch.
Comment 6 Remy Maucherat 2018-03-15 13:49:54 UTC
After noticing JSSE with Java 9 didn't have the SSL session access ISE, I removed them for OpenSSL. This should cleanup the logs in some edge cases like this one here.
Comment 7 Mark Thomas 2018-03-15 14:11:53 UTC
Fixed in:
- trunk for 9.0.7 onwards
- 8.5.x for 8.5.30 onwards
Comment 8 Remy Maucherat 2018-03-15 15:07:36 UTC
NIO2 is still broken (unfortunately my more optimized sync in writeHeaders has to be dropped again if header frames actually have to be written in order - too bad, another better optimization is needed ...). I'll fix it the easy (slow) way for now.

Also on shutdown I have many threads blocking forever on reserveWindowSize with that showcase app test.
Comment 9 Mark Thomas 2018-03-15 15:13:24 UTC
Re-opening as Rémy has indicated NIO2 still has issues.
Comment 10 Remy Maucherat 2018-03-15 15:23:47 UTC
So fixed in trunk for 9.0.7.