Bug 63913

Summary: java.lang.NullPointerException: Inflater has been closed
Product: Tomcat 9 Reporter: Boris Petrov <boris_petrov>
Component: WebSocketAssignee: Tomcat Developers Mailing List <dev>
Status: RESOLVED FIXED    
Severity: minor    
Priority: P2    
Version: 9.0.27   
Target Milestone: -----   
Hardware: PC   
OS: Linux   

Description Boris Petrov 2019-11-10 07:10:41 UTC
I've been getting these for a while now. I'm not 100% sure it is a bug but it does look like one. Here's the full stacktrace:

ERROR Error reading request, ignored
       java.lang.NullPointerException: Inflater has been closed
       	at java.base/java.util.zip.Inflater.ensureOpen(Inflater.java:714)
       	at java.base/java.util.zip.Inflater.inflate(Inflater.java:370)
       	at org.apache.tomcat.websocket.PerMessageDeflate.getMoreData(PerMessageDeflate.java:203)
       	at org.apache.tomcat.websocket.WsFrameBase.processDataText(WsFrameBase.java:407)
       	at org.apache.tomcat.websocket.WsFrameBase.processData(WsFrameBase.java:294)
       	at org.apache.tomcat.websocket.WsFrameBase.processInputBuffer(WsFrameBase.java:133)
       	at org.apache.tomcat.websocket.server.WsFrameServer.onDataAvailable(WsFrameServer.java:82)
       	at org.apache.tomcat.websocket.server.WsFrameServer.doOnDataAvailable(WsFrameServer.java:171)
       	at org.apache.tomcat.websocket.server.WsFrameServer.notifyDataAvailable(WsFrameServer.java:151)
       	at org.apache.tomcat.websocket.server.WsHttpUpgradeHandler.upgradeDispatch(WsHttpUpgradeHandler.java:148)
       	at org.apache.coyote.http11.upgrade.UpgradeProcessorInternal.dispatch(UpgradeProcessorInternal.java:54)
       	at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:53)
       	at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:861)
       	at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1579)
       	at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
       	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
       	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
       	at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
       	at java.base/java.lang.Thread.run(Thread.java:830)

It happens always on the same test. It's very strange, I have no clue what the problem could be. Ideas?
Comment 1 Mark Thomas 2019-11-10 11:47:08 UTC
Can you package that test into a standalone test case we can run to see if we can reproduce the issue?
Comment 2 Boris Petrov 2019-11-10 21:06:43 UTC
That's going to be quite difficult... it doesn't even happen every time...

The stacktrace doesn't make sense or it's not obvious what's going on? Do you think it is a bug or it's not sure? Because I guess having a NPE is not normal.
Comment 3 Mark Thomas 2019-11-11 14:48:19 UTC
Not sure at this point. Which JRE are you using (Vendor and exact version please)?
Comment 4 Mark Thomas 2019-11-11 15:32:09 UTC
The Servlet API does not expose the information required to fix this for Servlets that override getLastModified().

That could be overcome by reflection but then the code would have to assume that the Content-Type for a given Servlet was constant. That may be true in most cases but it isn't guaranteed to true and I don't like building solutions based on assumptions I know to be false.

The Default Servlet case can be fixed generically, and entirely in the ExpiresFilter but only by using Servlet 4.0 API. Therefore I have fixed this in 9.0.x using the Servlet 4.0 API and in 8.5.x using the Servlet 4 preview package. I don't see an easy way to fix this in 7.0.x.

Fixed in:
- master for 9.0.28 onwards
- 8.5.x for 8.5.48 onwards
Comment 5 Mark Thomas 2019-11-11 15:33:21 UTC
Sorry. Wrong bug. Re-opening.
Comment 6 Boris Petrov 2019-11-11 17:49:19 UTC
It happens on my local PC and on the integration server. These are the two Java versions:

Local:
%  java --version
openjdk 12.0.2 2019-07-16
OpenJDK Runtime Environment (build 12.0.2+10)
OpenJDK 64-Bit Server VM (build 12.0.2+10, mixed mode)

CI:
openjdk 13 2019-09-17
OpenJDK Runtime Environment (build 13+33-Ubuntu-1)
OpenJDK 64-Bit Server VM (build 13+33-Ubuntu-1, mixed mode)

That error happens only when we run our integration tests under Chrome. I guess because it is faster. Some timing issue somewhere...
Comment 7 Mark Thomas 2019-11-11 18:05:27 UTC
Can you provide a little more detail about what the test is doing?

Looking at the code I see the following sequence of events as a possible trigger:
- client is sending data
- server detects an error
- client is still sending data
- server does a (forceful - not waiting for the close handshake) shutdown
- PerMessageDefault transformer is shutdown
- client is still sending data (it hasn't received the close message yet)
- server receives data and tries to process it
- PerMessageDefault transformer throws this error

Might this match your test?

I want to look to see if I can prevent the server processing more data once the server initiates a forceful shutdown.
Comment 8 Boris Petrov 2019-11-11 19:53:23 UTC
This scenario does sound possible, yes.

The test that we have checks that whenever a user logs in one tab of the browser in our site, in another tab the session is also initiated. Same when logging out from one of the tabs - the other is also sent to the login screen.

After logging in, a CometD (websocket) connection is established and a bunch of traffic begins. So it's definitely possible that when one tab logs out the other is still sending and receiving messages. So the server would receive some messages when the session is destroyed. I guess that kind of matches what you wrote.
Comment 9 Mark Thomas 2019-11-11 20:06:31 UTC
Thanks. That is really helpful. Let me see what I can do to reproduce this.
Comment 10 Mark Thomas 2019-11-11 22:32:42 UTC
Does the test pass when this exception is logged?
Comment 11 Boris Petrov 2019-11-12 06:55:28 UTC
Well, we have some code that checks for exceptions in the BE and fails the test if there were any, but without it, the test would pass, yes.
Comment 12 Mark Thomas 2019-11-12 09:43:13 UTC
Thanks.

Neither the WebSocket code nor Tomcat's I/O code is expecting a NullPointerException here which is why it bubbles all the way up to the Processor which logs it and closes the connection.

I'm going to fix this by catching the NPE and wrapping it in an IOException which is a better match for what is going wrong and is an Exception that the code is designed to handle.

Fixed in:
- master for 9.0.28 onwards
- 8.5.x for 8.5.48 onwards
- 7.0.x for 7.0.98 onwards
Comment 13 Boris Petrov 2019-11-12 11:02:20 UTC
Thanks for the great support! If there are further troubles when I try 9.0.28, I'll let you know. Thank you again!