Bug 62332 - Possible memory leak in org.apache.coyote.AbstractProtocol$ConnectionHandler when there's a protocol error
Summary: Possible memory leak in org.apache.coyote.AbstractProtocol$ConnectionHandler ...
Status: RESOLVED FIXED
Alias: None
Product: Tomcat 8
Classification: Unclassified
Component: WebSocket (show other bugs)
Version: 8.5.x-trunk
Hardware: PC Linux
: P2 normal (vote)
Target Milestone: ----
Assignee: Tomcat Developers Mailing List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-04-26 16:52 UTC by Leonardo
Modified: 2018-04-27 12:23 UTC (History)
0 users



Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Leonardo 2018-04-26 16:52:33 UTC
We have an application that, after 12-24 hours, we need to restart it because of heavy memory usage. 

We detected, using JProfiler, that org.apache.coyote.AbstractProtocol$ConnectionHandler is the responsible for the leak, because it's storing several closed socket connections that should've been removed, but for some reason they weren't. 

Our application has heavy usage of Websockets and we have some clients that can open over 20,000 connections/day because of bad 3G/4G reception.

Upon debugging and looking at debug log levels in production, we noticed that we have a high load of exceptions in WsFrameBase.processInitialHeader that throws WsIoException with CloseReason.PROTOCOL_ERROR (probably due to packet loss given the bad connection). This exception is caught in WsHttpUpgradeHandler.upgradeDispatch() and closing the channel, but not returning a SocketState.CLOSED in this specific case (this issue is very similar to https://bz.apache.org/bugzilla/show_bug.cgi?id=62024 - but for a different CloseReason).

Unfortunately, we couldn't reproduce the bug locally using a script, so we forced WsFrameBase.processInitialHeader() to throw the WsIoException passing a PROTOCOL_ERROR(1002) as CloseReason.

We added a condition in the exception treatment in WsHttpUpgradeHandler 

"if (cr.getCloseCode() == CloseCodes.CLOSED_ABNORMALLY || cr.getCloseCode() == CloseCodes.PROTOCOL_ERROR) {
   return SocketState.CLOSED;
}

and used this modification in production for our application. After 24 hours, we no longer observed a leak and org.apache.coyote.AbstractProtocol$ConnectionHandler has a "healthy" amount of cached websocket connections.

I hope this helps and I'm available to answer any questions.

Best Regards,

Leonardo.
Comment 1 Mark Thomas 2018-04-27 12:23:52 UTC
Thanks for the report.

Having reviewed the code, I opted to always close the socket after an I/O error. In most (all?) cases it will not be possible to close the connection cleanly and this ensures that the connection is always closed.

Fixed in:
- trunk for 9.0.8 onwards
- 8.5.x for 8.5.31 onwards