Bug 64671 - HTTP/2 Stream.receivedData method throwing continuous NullPointerException in the logs
Summary: HTTP/2 Stream.receivedData method throwing continuous NullPointerException in...
Status: RESOLVED FIXED
Alias: None
Product: Tomcat 9
Classification: Unclassified
Component: Catalina (show other bugs)
Version: 9.0.37
Hardware: HP Linux
: P2 regression (vote)
Target Milestone: -----
Assignee: Tomcat Developers Mailing List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-08-17 13:36 UTC by Raghavendran V
Modified: 2020-09-05 15:55 UTC (History)
1 user (show)



Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Raghavendran V 2020-08-17 13:36:20 UTC
*Sub-Component - Coyote*

*OS : Redhat Linux*

*Description:*

*Setup:*

We have an implementation with Tomcat to transport http/2 packets between 2 systems (h2c connection). 

*Issue reproduction:*

With JMeter as simulation client, if we configure 200 threads (200 connections) to connect towards Tomcat Server 9.0.37 (embedded in our system) and if we send 200 requests per second (each packet app. having 8 to 10 KB), we are getting the following exceptions in Tomcat logs. 

    Exception in thread "http-nio-x.y.z.a-1234-exec-285" java.lang.NullPointerException
        at org.apache.coyote.http2.Stream.receivedData(Stream.java:618)
        at org.apache.coyote.http2.Http2UpgradeHandler.startRequestBodyFrame(Http2UpgradeHandler.java:1399)
        at org.apache.coyote.http2.Http2AsyncUpgradeHandler.startRequestBodyFrame(Http2AsyncUpgradeHandler.java:39)
        at org.apache.coyote.http2.Http2Parser.readDataFrame(Http2Parser.java:172)
        at org.apache.coyote.http2.Http2AsyncParser$FrameCompletionHandler.completed(Http2AsyncParser.java:245)
        at org.apache.coyote.http2.Http2AsyncParser$FrameCompletionHandler.completed(Http2AsyncParser.java:163)
        at org.apache.tomcat.util.net.SocketWrapperBase$VectoredIOCompletionHandler.completed(SocketWrapperBase.java:1087)
        at org.apache.tomcat.util.net.NioEndpoint$NioSocketWrapper$NioOperationState.run(NioEndpoint.java:1510)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
        at java.lang.Thread.run(Thread.java:748)


*Background*

Currently we are using 9.0.22 in our production environment. While pumping traffic with the above mentioned setup (with 9.0.22 tomcat version), we were seeing memory leak with Coyote.RequestInfo Objects. 

When analyzing the heap, we understood that the connection id number that was printed in the logs (Connection [22139]....) was almost matching the number of HTTP11Processor objects held in the heap, even though the number of live threads were less and the number of live connections too were less. 

So, suspecting a memory leak, we tried to simulate the same thing in the latest 9.x series tomcat version (9.0.37), in which we observed this exception (NullPointerException) and we were not seeing the memory leak issue. 

*Resolution/Clarification requested*

* Can you explain the root cause for this exception and is there an impact due to this NullPointerException? 
* Is there a known memory leak that is existing in 9.0.22 and is addressed in the upcoming versions, which is why we are not seeing the memory leak issue in 9.0.37? Can you please confirm? 


Thanks in advance.
Comment 1 Mark Thomas 2020-08-17 19:05:56 UTC
The NPE was detected in our CI system and fixed ~5 days ago. See https://github.com/apache/tomcat/commit/583b7ad17ff7cd67ac15a50e76cf2e0c5cd9138a

The memory leak looks like CVE-2020-13934. See https://tomcat.apache.org/security-9.html#Fixed_in_Apache_Tomcat_9.0.37
and "Ensure that the HTTP/1.1 processor is correctly recycled when a direct connection to h2c is made. (markt)" in the change log.
Comment 2 Raghavendran V 2020-08-18 06:31:17 UTC
Thank you so much for the swift response Mark! This is of great help for us. We will check on how we can take this in.
Comment 3 Raghavendran V 2020-08-18 09:49:50 UTC
As the heap issue (CVE-2020-13934) is a bit critical, we are planning to take 9.0.37 for time being. For that, a few more inputs we need on the NPE bug fix.

1. If the NPE issue fix is not included in the production (i.e., if we go with 9.0.37), is it only the logs that will be flooded with NPE, or are you seeing any other critical side effects (say memory leaks, etc.) due to the NPE issue?
2. Will this NPE fix be included in the 9.0.38 build (Sept release)?
3. You've indicated that 9.0.38 will be available in early September.  Is there a tentative date?

Thanks a ton for the support, in advance.
Comment 4 Mark Thomas 2020-08-18 20:25:38 UTC
1. It will trigger the closure of the entire connection which is probably not what you want.
2. Yes, it is already in the 9.0.x source code.
3. Sorry, no. It varies based in a number of factors. It typically happens in the second week of the month although since there hasn't been an August release I'm going to try and release in the first week of September.
Comment 5 Raghavendran V 2020-08-19 06:32:53 UTC
Once again, Thank you so much for the swift response Mark!
Comment 6 Raghavendran V 2020-09-04 12:32:30 UTC
Is there any recent plan changes on the 9.0.38 release dates? As this fix is critical for us to roll out our next product release, can you please update on whether there is a delay foreseen in the 9.0.x Sept release as indicated against the first week mentioned earlier? 

I understand that there is no promise on the dates. However, If you can help me with the plans and any change in that, that will be of great help. 

Thanks in advance.
Comment 7 Mark Thomas 2020-09-05 15:55:00 UTC
*** Bug 64710 has been marked as a duplicate of this bug. ***