Bug 64947

Summary: NPE in UpgradeProcessorExternal constructor
Product: Tomcat 9 Reporter: Peter Major <majorpetya>
Component: ConnectorsAssignee: Tomcat Developers Mailing List <dev>
Status: RESOLVED FIXED    
Severity: normal    
Priority: P2    
Version: 9.0.40   
Target Milestone: -----   
Hardware: PC   
OS: Mac OS X 10.4   

Description Peter Major 2020-12-01 15:19:48 UTC
When implementing a custom HttpUpgradeHandler implementation, the connection upgrade can fail with an NPE and the following stacktrace:

        java.lang.NullPointerException
                at org.apache.coyote.http11.upgrade.UpgradeProcessorExternal.<init>(UpgradeProcessorExternal.java:46)
                at org.apache.coyote.http11.AbstractHttp11Protocol.createUpgradeProcessor(AbstractHttp11Protocol.java:1102)
                at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:912)
                at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1601)
                at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
                at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
                at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
                at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
                at java.base/java.lang.Thread.run(Thread.java:834)

The custom http upgrade handler attempts to customize websocket connection upgrade, but this upgrade processing is terminated by the NPE caused by the null upgradeGroupInfo object in UpgradeProcessorExternal constructor.
Comment 1 Remy Maucherat 2020-12-01 15:45:40 UTC
Ok, this is new code that allows collecting stats.

I cannot immediately see why the UpgradeGroupInfo ends up null so more research could be useful [example code maybe ?], but there is a null check for that in UpgradeProcessorInternal (the one used most of the time), so adding it in UpgradeProcessorExternal would likely be a good plan.
Comment 2 Peter Major 2020-12-01 15:53:00 UTC
Sadly I don't have access to the source code (found it when setting up a third party reverse proxy product). The reason why upgradeGroupInfo is null, is because AbstractHttp11Protocol#getUpgradeGroupInfo is called with a null upgradeProtocol.
The reason why upgradeProtocol is null is that Request#upgrade populates it with response.getHeader("upgrade"), and that is being null. Was that meant to be request.getHeader instead?
Comment 3 Remy Maucherat 2020-12-01 16:08:05 UTC
(In reply to Peter Major from comment #2)
> Sadly I don't have access to the source code (found it when setting up a
> third party reverse proxy product). The reason why upgradeGroupInfo is null,
> is because AbstractHttp11Protocol#getUpgradeGroupInfo is called with a null
> upgradeProtocol.
> The reason why upgradeProtocol is null is that Request#upgrade populates it
> with response.getHeader("upgrade"), and that is being null. Was that meant
> to be request.getHeader instead?

Ok, so that makes sense, there is supposed to be a header "upgrade" in the response, it's mandatory in the 101 response [I suppose the client never checks it and just checks the 101 status]. But nothing checks for that missing header in Tomcat, and something has to be improved to avoid the NPE in that case.
Comment 4 Mark Thomas 2020-12-01 17:11:06 UTC
To re-phrase what Rémy said, we can fix the NPE but this is still going to fail because the response header is missing. You'll just get a nicer error message.
Comment 5 Peter Major 2020-12-01 17:27:04 UTC
The Upgrade header does eventually show up in the response, but it looks like it happens after the call to UpgradeProcessorExternal is made (added by an async servlet). From the browser's point of view I'm getting back a 101 response with the Upgrade header, but the connection is killed pretty much immediately after that.
Comment 6 Mark Thomas 2020-12-01 17:33:51 UTC
Hmm. Tomcat's Javadocs for HttpServletRequest.upgrade() have language stating the headers must be set before the method is called. The Servlet spec API does not have that language. I wonder where that language originated? I'll dig into it.

As I reviewed the code, it occurred to me we could probably create an "UNKNOWN" upgrade protocol to use for the stats. Might tweak that slightly so it can't possibly clash with a valid upgrade header.
Comment 7 Mark Thomas 2020-12-01 17:48:08 UTC
I can't fund any EG discussion to support a requirement that the upgrade header is set on the response before calling HttpServletRequest.upgrade(). I'll see if can find a different way to get the protocol name.
Comment 8 Remy Maucherat 2020-12-01 20:05:22 UTC
(In reply to Mark Thomas from comment #7)
> I can't fund any EG discussion to support a requirement that the upgrade
> header is set on the response before calling HttpServletRequest.upgrade().
> I'll see if can find a different way to get the protocol name.

The javadoc in Tomcat for HttpServletRequest.upgrade is wrong, most likely things changed in the spec after it was added.

In section 2.3.3.5, the spec says:
"When an upgrade request is received, the servlet can invoke the
HttpServletRequest.upgrade method, which starts the upgrade process. This
method instantiates the given HttpUpgradeHandler class. The returned
HttpUpgradeHandler instance may be further customized. The application prepares
and sends an appropriate response to the client. After exiting the service method
of the servlet, the servlet container completes the processing of all filters and marks
the connection to be handled by the HttpUpgradeHandler . It then calls the
HttpUpgradeHandler 's init method, passing a WebConnection to allow the protocol
handler access to the data streams."

So the actual upgrade happens when the Servlet is done, and nothing is fully set until then, no flush or commit happens when upgrade is called.
Overall, the behavior of Tomcat is ok except for that use of the response header.
Comment 9 Mark Thomas 2020-12-01 20:42:22 UTC
I have a fix for this locally. Just need to test it a little more.
Comment 10 Peter Major 2020-12-01 21:16:38 UTC
FYI if you have a PR/branch for this, I can test the fix locally.
Comment 11 Mark Thomas 2020-12-01 22:38:56 UTC
It is now in 9.0.x. Let us know how you get on.
Comment 12 Peter Major 2020-12-01 22:48:05 UTC
Things are looking good so far, haven't seen my websocket requests interrupted and catalina.out didn't show the NPE either.

Thank you for the quick turnaround.
Comment 13 Mark Thomas 2020-12-02 14:22:14 UTC
Fixed in:
- 10.0.x for 10.0.0-M11 onwards
- 9.0.x for 9.0.41 onwards
- 8.5.x for 8.5.61 onwards