Bug 67235 - NPE (NullPointerException) occurs in AsyncContextImpl.decrementInProgressAsyncCount after version 10.1.12.
Summary: NPE (NullPointerException) occurs in AsyncContextImpl.decrementInProgressAsyn...
Status: RESOLVED FIXED
Alias: None
Product: Tomcat 10
Classification: Unclassified
Component: Servlet (show other bugs)
Version: unspecified
Hardware: All All
: P2 normal (vote)
Target Milestone: ------
Assignee: Tomcat Developers Mailing List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-09-06 04:17 UTC by bright.k
Modified: 2023-09-15 07:43 UTC (History)
1 user (show)



Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description bright.k 2023-09-06 04:17:11 UTC
The error log "java.lang.NullPointerException: Cannot invoke 'org.apache.catalina.Context.decrementInProgressAsyncCount()' because 'this.context' is null" occurs in AsyncContextImpl.decrementInProgressAsyncCount when using Tomcat version 10.1.12 and later. This issue occurs when using SSE (Server-Sent Events). Specifically, it always happens when the client terminates the connection first, and the server then tries to send data. The issue occurs in both versions 10.1.12 and 10.1.13, but it does not occur in version 10.1.11.

Below is the attached stack trace.


```
java.lang.NullPointerException: Cannot invoke "org.apache.catalina.Context.decrementInProgressAsyncCount()" because "this.context" is null
	at org.apache.catalina.core.AsyncContextImpl.decrementInProgressAsyncCount(AsyncContextImpl.java:441)
	at org.apache.coyote.AsyncStateMachine.asyncPostProcess(AsyncStateMachine.java:295)
	at org.apache.coyote.AbstractProcessor.asyncPostProcess(AbstractProcessor.java:197)
	at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:78)
	at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:894)
	at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1740)
	at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:52)
	at org.apache.tomcat.util.threads.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1191)
	at org.apache.tomcat.util.threads.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:659)
	at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
	at java.base/java.lang.Thread.run(Thread.java:833)
```
Comment 1 Mark Thomas 2023-09-06 10:15:04 UTC
Are you using HTTP/2 for these requests? I'm guessing not but just wanted to check.
Comment 2 Mark Thomas 2023-09-06 10:49:35 UTC
I am unable to re-create the error described from the information provided.

Please provide the simplest possible test case the reproduces this error.
Comment 3 bright.k 2023-09-06 10:59:06 UTC
I did not use HTTP2.
This issue occurred when setting up SSE (Server-Sent Events), and it's expected that this could always happen when using an asynchronous servlet.

I'm building a server using the Spring Framework, and I'll briefly attach some sample code below that could cause this issue when running with embedded Tomcat as the web application server.

```
class EventStreamV2Controller {

    @GetMapping(value = ["/sse"], produces = [MediaType.TEXT_EVENT_STREAM_VALUE])
    fun connectStream(): SseEmitter {
        val sseEmitter = SseEmitter(Duration.ofSeconds(10).toMillis())

        Thread {
            Thread.sleep(Duration.ofSeconds(5).toMillis())
            sseEmitter.send("message")
        }.start()

        return sseEmitter
    }
}

```

When there is an API like the one described above, if the client preemptively terminates the SSE connection within 5 seconds after making the connection request, an issue occurs at the point when sseEmitter.send("message") is called.
Comment 4 Han Li 2023-09-07 06:44:39 UTC
It can be reproduced. From my debug observations, it turns out that it's ultimately due to the change of `org.apache.coyote.AsyncStateMachine#asyncError` (Maybe not final reason). 
Since I'm not very familiar with this area, I'll try to be as descriptive as possible :) 

Abnormal behavior:
10.1.11: execute `decreaseInProgressAsyncCount` first, then execute `recycle` 10.1.12: becomes `recycle` first, then execute `decreaseInProgressAsyncCount`.


1. When writing after 5s, an IO exception is encountered because the connection is broken. So you end up changing the state to ERROR at org.apache.coyote.AbstractProcessor#setErrorState:122 line, which is normal I think and which is first time to set state.
2.An scoket ERROR event is then generated after 1) and process it asynchronously.
The final execution reaches org.apache.catalina.core.AsyncContextImpl#setErrorState: 393 line to notify the listener that an exception exists. At this point, there is a listener `StandardServletAsyncWebRequest`. This listener will eventually execute to ` org.apache.coyote.AsyncStateMachine#asyncDispatch` where will set state to DISPATCHING. 
3.The process continues, and then it goes to org.apache.coyote.AsyncStateMachine#asyncError. This is the problem. 10.1.11 eventually set state to ERROR, but 10.1.12 there ensure the error processing is only started once per generation, the state is still DISPATCHING. 


Hope the analysis is right ;)
Comment 5 Mark Thomas 2023-09-08 17:12:11 UTC
Thanks for the reproducer and thanks Han for the analysis. The analysis looks good to me.

This bug has been present for a while and was exposed by the fix for bug 66841.

As my comments on bug 66941 suggested, there were a few things that needed to be cleaned up. One of which was multiple calls to error handlers. The fix for BZ 66841 started to do that but a side effect of that clean-up was to expose this bug. Previously it was masked by multiple loops through the async error handling process.

The fix for this is relatively straight forward although I am going to take the opportunity to have another look at the async error handling.
Comment 6 Mark Thomas 2023-09-13 10:55:06 UTC
I've applied the fix but I am leaving this issue open for now as I want to take a further a look at the error handling.
Comment 7 Mark Thomas 2023-09-15 07:43:27 UTC
The checks for duplicate error handling have been simplified and expanded. This addresses the behaviour I observed while investigating this bug.