Bug 63191 - RemoteEndpoint.Async#sendText(String, SendHandler) never calls the callback
Summary: RemoteEndpoint.Async#sendText(String, SendHandler) never calls the callback
Status: RESOLVED FIXED
Alias: None
Product: Tomcat 9
Classification: Unclassified
Component: WebSocket (show other bugs)
Version: 9.0.16
Hardware: PC Linux
: P2 normal (vote)
Target Milestone: -----
Assignee: Tomcat Developers Mailing List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-02-20 14:24 UTC by Boris Petrov
Modified: 2021-06-16 10:59 UTC (History)
0 users



Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Boris Petrov 2019-02-20 14:24:24 UTC
I've been trying to debug this for a very long time now. Here is the original issue in CometD:

https://github.com/cometd/cometd/issues/820

In this post:

https://github.com/cometd/cometd/issues/820#issuecomment-456693279

I got to the point that CometD calls "_wsSession.getAsyncRemote().sendText(...)" in "org.cometd.websocket.server.WebSocketEndPoint" on line 100 for CometD version 4.0.2.

I can reproduce this pretty consistently. Please tell me how to debug this - I can perhaps put some logging and build Tomcat and run with it and try to figure out what's going on.

We're being hit really hard by this and I'm very eager to help debug it.
Comment 1 Boris Petrov 2019-02-22 11:04:17 UTC
When it happens, there are two specific threads in a stack dump:

-----------

   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:46)
        - waiting to lock <0x00000006ece003c8> (a org.apache.tomcat.util.net.AprEndpoint$AprSocketWrapper)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.2/ThreadPoolExecutor.java:1128)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.2/ThreadPoolExecutor.java:628)
        at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
        at java.lang.Thread.run(java.base@11.0.2/Thread.java:834)

-----------

And:

-----------

   java.lang.Thread.State: WAITING (parking)
        at jdk.internal.misc.Unsafe.park(java.base@11.0.2/Native Method)
        - parking to wait for  <0x00000006e9bdec68> (a java.util.concurrent.CompletableFuture$Signaller)
        at java.util.concurrent.locks.LockSupport.park(java.base@11.0.2/LockSupport.java:194)
        at java.util.concurrent.CompletableFuture$Signaller.block(java.base@11.0.2/CompletableFuture.java:1796)
        at java.util.concurrent.ForkJoinPool.managedBlock(java.base@11.0.2/ForkJoinPool.java:3128)
        at java.util.concurrent.CompletableFuture.waitingGet(java.base@11.0.2/CompletableFuture.java:1823)
        at java.util.concurrent.CompletableFuture.get(java.base@11.0.2/CompletableFuture.java:1998)
        at org.cometd.websocket.server.WebSocketEndPoint.onMessage(WebSocketEndPoint.java:64)
        at org.cometd.websocket.server.WebSocketEndPoint.onMessage(WebSocketEndPoint.java:37)
        at org.apache.tomcat.websocket.WsFrameBase.sendMessageText(WsFrameBase.java:395)
        at org.apache.tomcat.websocket.server.WsFrameServer.sendMessageText(WsFrameServer.java:119)
        at org.apache.tomcat.websocket.WsFrameBase.processDataText(WsFrameBase.java:495)
        at org.apache.tomcat.websocket.WsFrameBase.processData(WsFrameBase.java:294)
        at org.apache.tomcat.websocket.WsFrameBase.processInputBuffer(WsFrameBase.java:133)
        at org.apache.tomcat.websocket.server.WsFrameServer.onDataAvailable(WsFrameServer.java:82)
        at org.apache.tomcat.websocket.server.WsFrameServer.doOnDataAvailable(WsFrameServer.java:171)
        at org.apache.tomcat.websocket.server.WsFrameServer.notifyDataAvailable(WsFrameServer.java:151)
        at org.apache.tomcat.websocket.server.WsHttpUpgradeHandler.upgradeDispatch(WsHttpUpgradeHandler.java:148)
        at org.apache.coyote.http11.upgrade.UpgradeProcessorInternal.dispatch(UpgradeProcessorInternal.java:54)
        at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:53)
        at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:834)
        at org.apache.tomcat.util.net.AprEndpoint$SocketProcessor.doRun(AprEndpoint.java:2336)
        at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
        - locked <0x00000006ece003c8> (a org.apache.tomcat.util.net.AprEndpoint$AprSocketWrapper)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.2/ThreadPoolExecutor.java:1128)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.2/ThreadPoolExecutor.java:628)
        at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
        at java.lang.Thread.run(java.base@11.0.2/Thread.java:834)

-----------

As you can see, the second thread has locked a monitor on which the first one is waiting. I'm not sure this is a problem, but is strange I guess and may be a deadlock that is the reason for the trouble I have. This happens both with the APR endpoint and with the Nio one (when I'm using our site over HTTPS or over HTTP).
Comment 2 Mark Thomas 2019-03-01 12:15:48 UTC
I've been looking at the full thread dump (https://github.com/cometd/cometd/files/2779329/threads.zip) which which the two threads mentioned above have been extracted.

Thread #206 is blocked by thread #203. That looks normal. Tomcat uses per socket locks to ensure two threads don't try to access the same socket at the same time.

So why is thread #203 blocked? It has made a call into org.cometd.websocket.server.WebSocketEndPoint.onMessage() that has not returned. It is waiting for a call to CompletableFuture.get() to return.

I've looked through the thread dump and I can't find:
- any reference to either the condition or the lock that thread #203 is waiting for;
- any code that looks like it might be doing some processing with a view to calling succeed() or fail() on that Promise.

The Promise.Completable is used here:
https://github.com/cometd/cometd/blob/4.0.x/cometd-java/cometd-java-websocket/cometd-java-websocket-javax-server/src/main/java/org/cometd/websocket/server/WebSocketEndPoint.java#L62

I don't see anywhere where that call to onMessage could result in either succeed() or fail() not being called on the Promise but that is what the stack trace suggests is happening and I am not at all familiar with the CometD code so I might be missing something.


I also note that the original description makes reference to calling wsSession.getAsyncRemote().sendText(...) and that not returning but I don't see any evidence of that call in the stack trace either.

At the moment, I'm not sure where the root cause lies but I am not seeing any evidence that suggests that this is a Tomcat bug.
Comment 3 Boris Petrov 2019-03-01 20:16:47 UTC
Thank you for the detailed answer.

I'll start with the end of your message.

In the GitHub issue I answered a couple of days ago with the following message:

I downgraded to CometD 3.1.8 and the problem goes away. CometD 4 introduced async methods for authorization and some other things which I guess use the async IO implementation of Tomcat. What I think happens is that Tomcat 9 (all versions) have a deadlock in their code. CometD 3 uses synchronous IO which works correctly in Tomcat and that's why it works fine. Does that make sense?

The same question I ask to you - does what I wrote there make sense? Is it possible that something like that happens? Is that consistent with everything you've read from my posts? Is it consistent with these stacktraces that you see?

wsSession.getAsyncRemote().sendText(...) - you will not see this in the stacktraces because that's an asynchronous call - the original thread called this method and then continued on. What happens inside it is probably in another thread which could be one of the two that are blocked, not sure. The evidence about this is the println's that I've put and pasted in the GitHub issue.

I'm also not familiar with CometD's code. You can see in the repo that I created that I put some logging here and there and I've pasted the output. I also am not sure what calls what and what goes where, I'm just sure that wsSession.getAsyncRemote().sendText(...) is called and it doesn't call its callback.

I'm not sure what more information I can give you. If I'm right about this, I think this is a pretty major bug in Tomcat. What I can suggest is, because I reproduce this very consistently, I can try debugging it for you. Tell me where to put println statements in Tomcat and I'll paste you the output when it breaks. In a few weeks' time I guess we'll get to the problem.
Comment 4 Mark Thomas 2019-03-01 20:39:02 UTC
So far, you haven't provided any evidence to back up a claim of a Tomcat bug.

That the changes between CometD 3 and CometD 4 trigger the issue look to be valid based on the evidence provided. The claim that there is an I/O bug in Tomcat doesn't make any sense at this point as there is no evidence supporting that claim.

The stack-traces are not consistent with your claims.

The longer stack trace you quote is from Tomcat reading an incoming WebSocket message. It is not triggered by a message write. I'm not seeing any link between a call to wsSession.getAsyncRemote().sendText(...) and that longer stack trace.

The longer stack trace is blocked in CometD code waiting for a CometD Promise to complete. I can't find any trace of code that might trigger that completion in the full stack trace (but I don't know CometD so I might have missed something).

My suggestion is to work with the CometD folks to figure out why that Promise isn't completing.

I'm not ruling out a Tomcat bug at this point. The explanation of why that Promise isn't completing may point to a Tomcat issue but, if it does, we need that explanation to know where to look at what to look for.
Comment 5 Boris Petrov 2019-03-01 21:57:01 UTC
I'm not sure I understand.

If you do believe my claim that a call to javax.websocket.RemoteEndpoint.Async.sendText never calls the callback, is it possible the bug to be anywhere else other than Tomcat? I mean, this is Tomcat code and I don't think it calls back to CometD or anything else before calling the callback passed, right?

If you don't believe me, how can I prove it to you? In the original post here I gave a link to my GitHub comment where I give the output from my println's where I explicitly note how "sendText" is called and how it never calls its callback. You can see the println's I've put here:

https://github.com/boris-petrov/cometd-test/blob/eaef0491161dc36d57585599c3dce449b65aee42/cometd-java/cometd-java-websocket/cometd-java-websocket-javax-server/src/main/java/org/cometd/websocket/server/WebSocketEndPoint.java#L101

And the output here:

https://github.com/cometd/cometd/issues/820#issuecomment-456693279

On Monday I will post the stacktrace of when exactly "_wsSession.getAsyncRemote().sendText" is called. I'm not sure it will help but I will give it.

What else can I do? If this is a deadlock as I'm suggesting, it's not that I can give you a simple reproduction.

Also, you keep talking about the long stacktrace. It's the *effect* of the bug, not the cause - it has blocked on a Promise that CometD is waiting on which has not been resolved because down the line "sendText" hasn't called its callback in "org/cometd/websocket/server/WebSocketEndPoint.java#L101".

Again, please tell me what to do to make you begin to think that there might be an issue in Tomcat or disprove me. I'm fine with both scenarios. I just want the bug fixed.
Comment 6 Mark Thomas 2019-03-01 23:37:27 UTC
There are some error conditions that (as per the spec) mean that the javax.websocket.RemoteEndpoint.Async.sendText callback is never called. So yes, it is perfectly possible for the bug to be elsewhere. Attempts to send async messages concurrently is one of the error conditions that will trigger that.

It bears repeating that the Tomcat committers don't know the CometD code. If the blocked CometD promise leads to the Async.sendText() call that isn't clear from the information provided to date. A stack trace that shows exactly how that happens would be helpful.

It appears that you are using the NIO Connector. Can you confirm that?

It appears that you are using Tomcat 9.0.14. Can you test with 9.0.16? There were some improvements in I/O error handling that might impact this issue. Unlikely, but worth ruling out.

Does message size have any impact on how easy this is to reproduce? I have a possible theory. If it is correct, larger messages should make the problem easier to reproduce.
Comment 7 Boris Petrov 2019-03-02 07:02:10 UTC
In the GitHub thread I also explained that I tried putting try/catch around the sendText call exactly because sometimes the callback is not called but rather an exception is thrown:

https://github.com/cometd/cometd/issues/820#issuecomment-457784822

And it's not the case. I even opened a bug in Tomcat about that:

https://bz.apache.org/bugzilla/show_bug.cgi?id=63114

Where you explained that this was according to the spec.

I'll provide the stacktrace on Monday. I would think that the "large" stacktrace/thread is the one that calls "sendText" but let's see.

In the second post in this issue here I wrote about the connectors:

This happens both with the APR endpoint and with the Nio one (when I'm using our site over HTTPS or over HTTP).

So I guess the problem is outside the connectors themselves.

About the version - I explicitly opened this issue here noting that the bug is for Tomcat 9.0.16. That was the first thing I tried when it came out - but it doesn't fix the issue unfortunately.

The error happens 99% of the time after a page refresh - some messages are streamed over the websocket but then I think when CometD sends heartbeats - that's when it fails. So I don't think these are big messages, but it probably can happen with such also. Not sure about that.
Comment 8 Boris Petrov 2019-03-04 14:59:17 UTC
java.lang.Exception
        at org.cometd.websocket.server.WebSocketEndPoint$Delegate.send(WebSocketEndPoint.java:102)
        at org.cometd.websocket.server.common.AbstractWebSocketEndPoint$Flusher.process(AbstractWebSocketEndPoint.java:500)
        at org.eclipse.jetty.util.IteratingCallback.processing(IteratingCallback.java:241)
        at org.eclipse.jetty.util.IteratingCallback.succeeded(IteratingCallback.java:365)
        at org.cometd.websocket.server.WebSocketEndPoint$Delegate.lambda$send$0(WebSocketEndPoint.java:113)
        at org.apache.tomcat.websocket.WsRemoteEndpointImplBase$TextMessageSendHandler.onResult(WsRemoteEndpointImplBase.java:811)
        at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.endMessage(WsRemoteEndpointImplBase.java:417)
        at org.apache.tomcat.websocket.WsRemoteEndpointImplBase$EndMessageHandler.onResult(WsRemoteEndpointImplBase.java:532)
        at org.apache.tomcat.websocket.server.WsRemoteEndpointImplServer$OnResultRunnable.run(WsRemoteEndpointImplServer.java:320)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
        at java.base/java.lang.Thread.run(Thread.java:834)


This is the stacktrace of the thread that calls "sendText". Unfortunately it's not very useful I think. Or is it? It is exactly the same as many before it which have successfully called the callback. Any ideas?
Comment 9 Mark Thomas 2019-03-05 20:23:03 UTC
It is not clear that Tomcat not calling the callback for Async.sendText() is the cause of CometD failing to complete a Promise rather than the other way around.

With Callbacks triggering the sending of new messages it is possible some sort of deadlock is entered but it isn't clear what the sequence of events leading to this issue is at this point. Until we know that any attempt to fix this issue is going to be largely guess work.

Knowing the sort of thing that is happening, we can do a code inspection and try and find potential deadlocks but - if the issue is in Tomcat - finding it that way is going to involve a large amount of luck.
Comment 10 Boris Petrov 2019-03-06 07:42:30 UTC
(In reply to Mark Thomas from comment #9)
> It is not clear that Tomcat not calling the callback for Async.sendText() is
> the cause of CometD failing to complete a Promise rather than the other way
> around.

I'm not sure I understand how the other way around is possible. Is "Async.sendText" somehow calling back to user code (before the response callback I mean)? How else could it be possible that someone calling "Async.sendText" is responsible for it not calling the callback? Perhaps we can start digging from there?

> Until we know that any attempt to
> fix this issue is going to be largely guess work.
> 
> Knowing the sort of thing that is happening, we can do a code inspection and
> try and find potential deadlocks but - if the issue is in Tomcat - finding
> it that way is going to involve a large amount of luck.

OK, please suggest what we do now. I'm not sure how to proceed. How can I help more?

Do you have some sort of stress-tests that you can rewrite to use the async connectors/code? I think it should be fairly easy to reproduce the problem seeing how it happens all the time when I try it.
Comment 11 Mark Thomas 2019-03-06 10:25:21 UTC
Going back to the original thread dump with the "long" and the "short" stack traces...

It is possible that the short stack trace is Tomcat trying to send the rest of the message sent via Async.sendText() and it blocked by the longer stack trace which is, in turn, blocked because the CometD Promise has not completed.

Hence my question: what is that Promise waiting for? I don't know the CometD code well enough to trace it through to answer that question.

I can see possibilities for deadlock here but it could be a complete red herring.

What would be helpful at this point is an explanation of the execution path of CometD that shows what action the Promise is waiting to complete.
Comment 12 Boris Petrov 2019-11-03 16:00:03 UTC
Mark, I can't be completely certain but this issue seems to be resolved in 9.0.27 (probably in some of the previous ones too). Have there been any/many changes between the last broken version until today that could have resolved that? Do you have any ideas or we just call it done and if it reappears, we can talk about it anew?
Comment 13 Mark Thomas 2019-11-03 19:23:30 UTC
I read through the changelog since 9.0.14 for WebSocket and Coyote and there were a handful of things that could have impacted this. They might have fixed a bug, they might just have changed the timing sufficiently that the bug is less likely.

If you can no longer reproduce this then I am happy to close this issue. We can always create a new one if the issue re-surfaces.
Comment 14 Boris Petrov 2019-11-04 11:03:07 UTC
OK, thanks. I'll mark it as fixed for now and will shout loud if it reappears. Thank you for the great support!
Comment 15 Boris Petrov 2019-11-29 11:38:22 UTC
So, actually the bug is still here even on `9.0.29`, although maybe less often, yes. Actually it happens mostly (only?) with Safari on Mac OS X, not that it should matter. This is the stacktrace when running on `9.0.29` (there's a billion threads blocked like this):

---------

   java.lang.Thread.State: WAITING (parking)
        at jdk.internal.misc.Unsafe.park(java.base@13/Native Method)
        - parking to wait for  <0x000000068aacc300> (a java.util.concurrent.CompletableFuture$Signaller)
        at java.util.concurrent.locks.LockSupport.park(java.base@13/LockSupport.java:194)
        at java.util.concurrent.CompletableFuture$Signaller.block(java.base@13/CompletableFuture.java:1867)
        at java.util.concurrent.ForkJoinPool.managedBlock(java.base@13/ForkJoinPool.java:3137)
        at java.util.concurrent.CompletableFuture.waitingGet(java.base@13/CompletableFuture.java:1894)
        at java.util.concurrent.CompletableFuture.get(java.base@13/CompletableFuture.java:2069)
        at org.cometd.websocket.server.WebSocketEndPoint.onMessage(WebSocketEndPoint.java:64)
        at org.cometd.websocket.server.WebSocketEndPoint.onMessage(WebSocketEndPoint.java:37)
        at org.apache.tomcat.websocket.WsFrameBase.sendMessageText(WsFrameBase.java:395)
        at org.apache.tomcat.websocket.server.WsFrameServer.sendMessageText(WsFrameServer.java:119)
        at org.apache.tomcat.websocket.WsFrameBase.processDataText(WsFrameBase.java:495)
        at org.apache.tomcat.websocket.WsFrameBase.processData(WsFrameBase.java:294)
        at org.apache.tomcat.websocket.WsFrameBase.processInputBuffer(WsFrameBase.java:133)
        at org.apache.tomcat.websocket.server.WsFrameServer.onDataAvailable(WsFrameServer.java:82)
        at org.apache.tomcat.websocket.server.WsFrameServer.doOnDataAvailable(WsFrameServer.java:171)
        at org.apache.tomcat.websocket.server.WsFrameServer.notifyDataAvailable(WsFrameServer.java:151)
        at org.apache.tomcat.websocket.server.WsHttpUpgradeHandler.upgradeDispatch(WsHttpUpgradeHandler.java:148)
        at org.apache.coyote.http11.upgrade.UpgradeProcessorInternal.dispatch(UpgradeProcessorInternal.java:54)
        at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:59)
        at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:860)
        at org.apache.tomcat.util.net.AprEndpoint$SocketProcessor.doRun(AprEndpoint.java:2022)
        at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
        - locked <0x000000068aacc440> (a org.apache.tomcat.util.net.AprEndpoint$AprSocketWrapper)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@13/ThreadPoolExecutor.java:1128)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@13/ThreadPoolExecutor.java:628)
        at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
        at java.lang.Thread.run(java.base@13/Thread.java:830)

Not sure how to proceed. If you would like, we can reopen this (so you don't forget it) and I'll be trying from time to time new versions to see what goes. Thanks!
Comment 16 Boris Petrov 2020-04-20 16:24:33 UTC
Just a heads up. The same still happens on Tomcat 9.0.34 and CometD 5 (on 4 too). Doesn't happen *at all* when using CometD 3 (which uses blocking IO).
Comment 17 Boris Petrov 2020-10-26 10:46:25 UTC
Still happening on Tomcat 9.0.39 and CometD 5.0.1. It seemed to work for a day or two (being accessed mostly from Chrome and Firefox) but then a connection from Safari broke it again. Bot sure how this could be relevant but just mentioning it. Here is the blocked thread's stack:

java.lang.Thread.State: WAITING (parking)
     at jdk.internal.misc.Unsafe.park(java.base@14.0.1/Native Method)
     - parking to wait for  <0x000000065a4c4ad8> (a java.util.concurrent.CompletableFuture$Signaller)
     at java.util.concurrent.locks.LockSupport.park(java.base@14.0.1/LockSupport.java:211)
     at java.util.concurrent.CompletableFuture$Signaller.block(java.base@14.0.1/CompletableFuture.java:1860)
     at java.util.concurrent.ForkJoinPool.managedBlock(java.base@14.0.1/ForkJoinPool.java:3137)
     at java.util.concurrent.CompletableFuture.waitingGet(java.base@14.0.1/CompletableFuture.java:1887)
     at java.util.concurrent.CompletableFuture.get(java.base@14.0.1/CompletableFuture.java:2062)
     at org.cometd.server.websocket.javax.WebSocketEndPoint.onMessage(WebSocketEndPoint.java:64)
     at org.cometd.server.websocket.javax.WebSocketEndPoint.onMessage(WebSocketEndPoint.java:37)
     at org.apache.tomcat.websocket.WsFrameBase.sendMessageText(WsFrameBase.java:402)
     at org.apache.tomcat.websocket.server.WsFrameServer.sendMessageText(WsFrameServer.java:119)
     at org.apache.tomcat.websocket.WsFrameBase.processDataText(WsFrameBase.java:502)
     at org.apache.tomcat.websocket.WsFrameBase.processData(WsFrameBase.java:301)
     at org.apache.tomcat.websocket.WsFrameBase.processInputBuffer(WsFrameBase.java:133)
     at org.apache.tomcat.websocket.server.WsFrameServer.onDataAvailable(WsFrameServer.java:82)
     at org.apache.tomcat.websocket.server.WsFrameServer.doOnDataAvailable(WsFrameServer.java:171)
     at org.apache.tomcat.websocket.server.WsFrameServer.notifyDataAvailable(WsFrameServer.java:151)
     at org.apache.tomcat.websocket.server.WsHttpUpgradeHandler.upgradeDispatch(WsHttpUpgradeHandler.java:148)
     at org.apache.coyote.http11.upgrade.UpgradeProcessorInternal.dispatch(UpgradeProcessorInternal.java:54)
     at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:59)
     at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:868)
     at org.apache.tomcat.util.net.AprEndpoint$SocketProcessor.doRun(AprEndpoint.java:1991)
     at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
     - locked <0x000000065a4c4c18> (a org.apache.tomcat.util.net.AprEndpoint$AprSocketWrapper)
     at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@14.0.1/ThreadPoolExecutor.java:1130)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@14.0.1/ThreadPoolExecutor.java:630)
     at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
     at java.lang.Thread.run(java.base@14.0.1/Thread.java:832)
Comment 18 Boris Petrov 2021-06-16 10:59:58 UTC
Still happening on Tomcat 9.0.48 and CometD 6.0.3. Here is a blocked thread's stack:

java.lang.Thread.State: WAITING (parking)
        at jdk.internal.misc.Unsafe.park(java.base@16.0.1/Native Method)
        - parking to wait for  <0x00000006168051e0> (a java.util.concurrent.CompletableFuture$Signaller)
        at java.util.concurrent.locks.LockSupport.park(java.base@16.0.1/LockSupport.java:211)
        at java.util.concurrent.CompletableFuture$Signaller.block(java.base@16.0.1/CompletableFuture.java:1860)
        at java.util.concurrent.ForkJoinPool.managedBlock(java.base@16.0.1/ForkJoinPool.java:3137)
        at java.util.concurrent.CompletableFuture.waitingGet(java.base@16.0.1/CompletableFuture.java:1894)
        at java.util.concurrent.CompletableFuture.get(java.base@16.0.1/CompletableFuture.java:2068)
        at org.cometd.server.websocket.javax.WebSocketEndPoint.onMessage(WebSocketEndPoint.java:62)
        at org.cometd.server.websocket.javax.WebSocketEndPoint.onMessage(WebSocketEndPoint.java:35)
        at org.apache.tomcat.websocket.WsFrameBase.sendMessageText(WsFrameBase.java:415)
        at org.apache.tomcat.websocket.server.WsFrameServer.sendMessageText(WsFrameServer.java:129)
        at org.apache.tomcat.websocket.WsFrameBase.processDataText(WsFrameBase.java:515)
        at org.apache.tomcat.websocket.WsFrameBase.processData(WsFrameBase.java:301)
        at org.apache.tomcat.websocket.WsFrameBase.processInputBuffer(WsFrameBase.java:133)
        at org.apache.tomcat.websocket.server.WsFrameServer.onDataAvailable(WsFrameServer.java:85)
        at org.apache.tomcat.websocket.server.WsFrameServer.doOnDataAvailable(WsFrameServer.java:183)
        at org.apache.tomcat.websocket.server.WsFrameServer.notifyDataAvailable(WsFrameServer.java:162)
        at org.apache.tomcat.websocket.server.WsHttpUpgradeHandler.upgradeDispatch(WsHttpUpgradeHandler.java:156)
        at org.apache.coyote.http11.upgrade.UpgradeProcessorInternal.dispatch(UpgradeProcessorInternal.java:60)
        at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:59)
        at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:893)
        at org.apache.tomcat.util.net.AprEndpoint$SocketProcessor.doRun(AprEndpoint.java:2049)
        at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
        - locked <0x0000000616804fa0> (a org.apache.tomcat.util.net.AprEndpoint$AprSocketWrapper)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@16.0.1/ThreadPoolExecutor.java:1130)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@16.0.1/ThreadPoolExecutor.java:630)
        at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
        at java.lang.Thread.run(java.base@16.0.1/Thread.java:831)