Summary: | HTTP2 : GOAWAY sent with Protocol Error and Frame Size Error | ||
---|---|---|---|
Product: | Tomcat 9 | Reporter: | Arshiya <arshiya.shariff> |
Component: | Catalina | Assignee: | Tomcat Developers Mailing List <dev> |
Status: | RESOLVED FIXED | ||
Severity: | regression | ||
Priority: | P2 | ||
Version: | 9.0.39 | ||
Target Milestone: | ----- | ||
Hardware: | HP | ||
OS: | Linux | ||
Attachments: |
FRAME_SIZE_ERROR PCAP
PROTOCOL ERROR
JMX file to reproduce issue
Sample application to reproduce the issue
PCAP with Goaway |
Description
Arshiya
2020-10-20 06:05:00 UTC
Created attachment 37513 [details]
FRAME_SIZE_ERROR PCAP
Created attachment 37514 [details]
PROTOCOL ERROR
Created attachment 37515 [details]
JMX file to reproduce issue
Created attachment 37516 [details]
Sample application to reproduce the issue
Any update on this, please?

Pinging for updates on BZs is not allowed. This is not the first time you have done this.

*** Bug 64829 has been marked as a duplicate of this bug. ***

*** Bug 64828 has been marked as a duplicate of this bug. ***

What hardware are you running this on? I've been running the test case locally for 15+ mins and I haven't seen a single error reported. Details of the hardware used, the memory allocated to Tomcat and JMeter, and the network between client and server would likely be helpful. If we can't recreate the issue, we can't debug it.

Moving to NEEDINFO as more information about the provided test case is required to reproduce the issue.

Mark, in production we see the issue reported in this bug and bug 64828 on Linux. I reproduced the issue with the attached source code on my local Windows machine (Windows 10, 64-bit, 16 GB RAM).
JMeter version: 5.3 with Xmx: 5g
Embedded Tomcat application: default Xmx
Thanks in advance!

Thanks. I can try testing on a similar spec machine (although it will be a VM). How long does the test case have to be running before you start to see errors?

Thanks, Mark. Within 15 to 20 minutes of running the case I was able to see the errors. Please let us know if you need more information.

Success. This was trivial to recreate on Windows. Must be a timing thing. Fix on the way.

Fixed in:
- 10.0.x for 10.0.0-M10 onwards
- 9.0.x for 9.0.40 onwards
- 8.5.x for 8.5.60 onwards

Thank you so much for the fix, Mark. One clarification, please: were you able to reproduce and fix the payload-related issue reported in bug 64828 as well? Thanks in advance!

I didn't see any payload errors reported. The other errors occurred after a few seconds on Windows so I fixed those then ran a longer test (20 mins) where no errors were observed. I don't immediately see how the issues I fixed could lead to the payload errors described. It would be informative if you were to test the latest 9.0.x code and report back.

Hi Mark, I tried running the same test program with the latest Tomcat 9.0.41 jars on a Windows machine.

1.) I am still able to see the incomplete-payload related exceptions and GOAWAY (PROTOCOL_ERROR and FRAME_SIZE_ERROR). The test (payload size 55 KB) ran for about 8 minutes, during which a few requests timed out (visible in the JMeter GUI). Filtering the requests in the captured PCAP by the unique identifier, I was able to find the trace for 3 requests with the following reasons for GOAWAY (GOAWAY.pcap attached):
*) FRAME_SIZE_ERROR: The payload is [2128653] bytes long but the maximum frame size is [16384]
*) PROTOCOL_ERROR: Connection [7004], Stream [1], The content length header value [56,465] does not agree with the size of the data received [56,466]
*) PROTOCOL_ERROR: Connection [7092], Stream [1], The content length header value [56,466] does not agree with the size of the data received [56,466]

Specs:
Windows 10
Processor: Intel(R) Core(TM) i5-8350U CPU @ 1.70GHz 1.90GHz
RAM: 16 GB
System type: 64-bit

Please let me know if you need further input. Thanks in advance!

Created attachment 37600 [details]
PCAP with Goaway
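
For readers following the two GOAWAY reasons quoted above, here is a minimal sketch of the checks that RFC 7540 describes: the 24-bit payload length in the 9-byte frame header is compared with the maximum frame size, and the declared content-length is compared with the sum of the DATA frame payload lengths. This is not Tomcat's code; the class and method names are hypothetical, and the DATA frame split in `main` is an assumed example. As a side note, the reported length 2128653 equals 0x207B0D, which are the byte values of a space, `{` and a carriage return. That would be consistent with request body bytes being read as a frame header, but this is only one possible reading and is not confirmed anywhere in this report.

```java
/** Hypothetical illustration of two HTTP/2 sanity checks; not Tomcat's implementation. */
final class Http2CheckSketch {

    /** Initial value of SETTINGS_MAX_FRAME_SIZE (RFC 7540, section 6.5.2). */
    static final int DEFAULT_MAX_FRAME_SIZE = 16_384;

    /** Decodes the 24-bit payload length from the first three bytes of a frame header. */
    static int payloadLength(byte[] header) {
        requireHeader(header);
        return ((header[0] & 0xFF) << 16) | ((header[1] & 0xFF) << 8) | (header[2] & 0xFF);
    }

    /** Decodes the 31-bit stream identifier; the reserved high bit is masked off. */
    static int streamId(byte[] header) {
        requireHeader(header);
        return ((header[5] & 0x7F) << 24) | ((header[6] & 0xFF) << 16)
                | ((header[7] & 0xFF) << 8) | (header[8] & 0xFF);
    }

    /** True when the advertised payload length exceeds the allowed frame size. */
    static boolean isFrameSizeError(byte[] header, int maxFrameSize) {
        return payloadLength(header) > maxFrameSize;
    }

    /** True when the DATA frame payload lengths add up to the declared content-length. */
    static boolean contentLengthMatches(long declared, int[] dataFrameLengths) {
        long received = 0;
        for (int length : dataFrameLengths) {
            received += length;
        }
        return received == declared;
    }

    private static void requireHeader(byte[] header) {
        if (header == null || header.length < 9) {
            throw new IllegalArgumentException("An HTTP/2 frame header is 9 bytes long");
        }
    }

    public static void main(String[] args) {
        // Space, '{', CR followed by type, flags and stream id 1 (assumed sample bytes).
        byte[] header = {0x20, 0x7B, 0x0D, 0, 0, 0, 0, 0, 1};
        System.out.println(payloadLength(header));                            // 2128653
        System.out.println(isFrameSizeError(header, DEFAULT_MAX_FRAME_SIZE)); // true

        // Assumed split of a request body into DATA frames, totalling 56,466 bytes.
        int[] dataFrames = {16_384, 16_384, 16_384, 7_314};
        System.out.println(contentLengthMatches(56_465, dataFrames));         // false
    }
}
```

An off-by-one between the declared and received sizes, as in the first PROTOCOL_ERROR above, is enough to fail the second check, at which point the request is treated as malformed.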
Looks like there is still an issue related to the HPACK decoder. I can see a debug log where the decoder reports a content-length that is not consistent with the value shown in the Wireshark trace. Still trying to figure out how this is happening.

Found it. It wasn't in the HPACK decoder at all. I've applied a patch and I can no longer get the test case to fail. Could you retest with a new 9.0.x build?

I am going to assume that the recent commit fixed this. If this is not the case please do re-open, but I think we'd need a new test case in that case as I can no longer trigger any errors with this one. |