50667 – Tribes | RpcChannel | Add a callback for cases when an error occured sending a reply to an RP call

Bug 50667 - Tribes | RpcChannel | Add a callback for cases when an error occured sending a reply to an RP call

Summary: Tribes | RpcChannel | Add a callback for cases when an error occured sending ...

Status:	RESOLVED FIXED

Alias:	None

Product:	Tomcat 7
Classification:	Unclassified
Component:	Cluster (show other bugs)
Version:	7.0.6
Hardware:	All All

Importance:	P2 enhancement (vote)
Target Milestone:	---
Assignee:	Tomcat Developers Mailing List

URL:
Keywords:

Depends on:
Blocks:

Reported:	2011-01-27 06:55 UTC by Olivier Costet
Modified:	2011-02-09 12:40 UTC (History)
CC List:	0 users

Attachments
svn diff (1.55 KB, patch) 2011-01-27 08:49 UTC, Olivier Costet	Details \| Diff
Patch to include replyFailed functionality (5.56 KB, patch) 2011-01-27 16:50 UTC, Filip Hanik	Details \| Diff
Different suggestion for replyFailed (9.06 KB, patch) 2011-01-31 04:48 UTC, Olivier Costet	Details \| Diff
View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Olivier Costet 2011-01-27 06:55:29 UTC

The RpcChannel API user registers an RpcCallback to be notified of
communication events. However, there is no callback method for the case where
the sending of a response message fails. This entry proposes the addition of a callback method for such cases.

CAVEAT: this proposal modifies the RpcCallback interface. RpcCallback is currently implemented by at least four tomcat classes, EchoRpcTest in org.apache.catalina.tribes.demos, and AbstractReplicatedMap, ReplicatedMap ad LazyReplicatedMap in org.apache.catalina.tribes.tipis. The attached patch does not include the implementation of the additional interface method for those classes, as this coder did not feel qualified to tell how it should be implemented. They could probably simply be NOOPs, however.

Comment 1 Olivier Costet 2011-01-27 06:57:27 UTC

Reproducing comments from https://issues.apache.org/bugzilla/show_bug.cgi?id=50648:

Filip Hanik wrote:
> 2. 
> What type of exceptions have you seen here?
> Reply failed callback, is a strange callback, since the receiver, not the
> sender would receive that callback. That would make for some confusing
> programming logic. The sender will simply timeout, and not get the reply, and
> has to act accordingly. But I'm not sure the receiver needs to be notified.

I haven't /seen/ any exceptions, but I would assume any exception that can be
thrown by a normal Channel#send could come up here, since a Channel#send is
what happens.
The RpcCallback interface contains both callbacks that are invoked on the
sender (leftOver) and receiver (replyRequest) side. If that isn't confusing, I
don't see why adding one would be. 
The specific case where I felt the need for such a callback was one where I was
shuttling instances across nodes -- by "shuttling" I mean removing them from
one and putting them on the other. I would receive a request through the
RpcChannel for an instance, look it up, unregister it locally, pack it in the
reply, send it. If the sending failed, I would want to re-register it locally,
so as not to lose data.
Whatever you may think of that usage, the fact remains that there's a
discrepancy between requests on an RpcChannel, where the sender is notified of
communication errors, and replies on the RpcChannel, where he (the sender of
the reply) isn't. Even though in both cases it's the same Channel#send that is
performed.

Comment 2 Olivier Costet 2011-01-27 08:49:08 UTC

Created attachment 26561 [details]
svn diff

CAVEAT: this proposal modifies the RpcCallback interface. RpcCallback is
currently implemented by at least four tomcat classes, EchoRpcTest in
org.apache.catalina.tribes.demos, and AbstractReplicatedMap, ReplicatedMap ad
LazyReplicatedMap in org.apache.catalina.tribes.tipis. The attached patch does
not include the implementation of the additional interface method for those
classes, as this coder did not feel qualified to tell how it should be
implemented. They could probably simply be NOOPs, however.

Comment 3 Filip Hanik 2011-01-27 11:33:32 UTC

(In reply to comment #2)
> Created an attachment (id=26561) [details]
> svn diff
> 
> CAVEAT: this proposal modifies the RpcCallback interface. RpcCallback is
> currently implemented by at least four tomcat classes, EchoRpcTest in
> org.apache.catalina.tribes.demos, and AbstractReplicatedMap, ReplicatedMap ad
> LazyReplicatedMap in org.apache.catalina.tribes.tipis. The attached patch does
> not include the implementation of the additional interface method for those
> classes, as this coder did not feel qualified to tell how it should be
> implemented. They could probably simply be NOOPs, however.

I still don't understand this enhancement. What is the possibly use case for implementing replyFailed.
The callback method replyFailed would get called to the receiver. And this is backwards. There is nothing the receiver can do if reply fails. There are no actions to be taken. It wont allow the receiver to retry the attempt. So the method is at best a no-op, but really just a method one has to implement with zero benefit. The sender still has to make a decision on what to do without a reply. 

best
Filip

Comment 4 Olivier Costet 2011-01-27 12:55:27 UTC

(In reply to comment #3)
> I still don't understand this enhancement. What is the possibly use case for
> implementing replyFailed.
> The callback method replyFailed would get called to the receiver. And this is
> backwards. There is nothing the receiver can do if reply fails. There are no
> actions to be taken. It wont allow the receiver to retry the attempt. So the
> method is at best a no-op, but really just a method one has to implement with
> zero benefit. The sender still has to make a decision on what to do without a
> reply. 

To reiterate: 
The specific case where I felt the need for such a callback was one where I was
shuttling objects across nodes -- by "shuttling" I mean removing them from
one and putting them on the other. I would receive a request for a specific object through the RpcChannel, look the object up, unregister it locally, pack it in the reply, send the reply. If the sending failed, I would want to re-register the object locally, so as not to lose data.

What is it that makes you consider this use-case invalid?

Comment 5 Filip Hanik 2011-01-27 16:49:01 UTC

Hi Olivier, yes, I see the use case. I do have a different suggestion for the solution, as there is a significant change. 
I'm attaching  a patch with the summary of

1. Not breaking the existing interfaces
2. Adding better feedback on a failure, and let the replying code decide course of action
3. feedback on success/completion of a reply

Ability to handle async message sending.

Comment 6 Filip Hanik 2011-01-27 16:50:07 UTC

Created attachment 26567 [details]
Patch to include replyFailed functionality

Comment 7 Olivier Costet 2011-01-28 04:40:39 UTC

(In reply to comment #5)
> Hi Olivier, yes, I see the use case. I do have a different suggestion for the
> solution, as there is a significant change. 
> I'm attaching  a patch with the summary of
> 
> 1. Not breaking the existing interfaces
> 2. Adding better feedback on a failure, and let the replying code decide course
> of action
> 3. feedback on success/completion of a reply
> 
> Ability to handle async message sending.

Excellent! And yes, I also think it's better not to modify the existing interface.

Thanks a lot, Filip, for this and for your work in general.

Comment 8 Filip Hanik 2011-01-28 13:31:15 UTC

hi Olivier, I'm still noodling (thinking) about this. I am not thrilled about the 
if (async) else 
case here. But rest assured, a solution will come.

best
Filip

Comment 9 Olivier Costet 2011-01-31 04:46:21 UTC

(In reply to comment #8)
> hi Olivier, I'm still noodling (thinking) about this. I am not thrilled about
> the 
> if (async) else 
> case here. 

I think I see what you mean. I would say the problem is the "retry" capability: since there can't be any retrying in async mode, it's what causing the discrepancy between the sync and async case.
I'm attaching a diff with a suggestion of how it might look like without the option to retry. Perhaps it can be of help. But by all means, do noodle! :)

Comment 10 Olivier Costet 2011-01-31 04:48:03 UTC

Created attachment 26579 [details]
Different suggestion for replyFailed

Comment 11 Filip Hanik 2011-02-08 15:29:22 UTC

hi Olivier, thanks for your patch.
Having two callback objects seems a bit overkill. It's the route I went down originally, but decided against it.
I've checked in a fix, sparked by looking at your code in r1068549
This should satisfy the requirements in terms getting the confirmation in both async and non async mode.

Let me know if this does it!

best
Filip

Comment 12 Olivier Costet 2011-02-09 04:59:37 UTC

Hi Filip,

I would have thought the second callback could be implemented in the same class is the (Extended)RpcCallback, but at any rate it's your call. I got what I wanted out of this, so it's fine by me.

A few remarks on the code:
1. Unless I'm very much mistaken, the retry functionality won't work in async mode. This should be documented in ExtendedRpcCallback#replyFailed.
2. As far as I can see, GroupChannel#send does handle the case where the ErrorHandler is null, so you probably don't need an if/else to call the appropriate Channel#send in RpcChannel#messageReceived. Actually, this would depend on the contract of Channel#send, which I'm not aware of, so I don't know for sure.
3. The call to ExtendedRpcCallback#replySucceeded in the synchronous case should be move outside the try/catch block. Perhaps put it in a try/catch of its own. The way it is now, it would cause spurious behaviour if it were to throw a RuntimeException.

I think that's about it.

Cheers, 
 Olivier.

Comment 13 Filip Hanik 2011-02-09 12:40:16 UTC

(In reply to comment #12)
> Hi Filip,
> 
> I would have thought the second callback could be implemented in the same class
> is the (Extended)RpcCallback, but at any rate it's your call.

It is, the other patch introduced a third class for callbacks that was wrapped in an error handler. 

 I got what I
> wanted out of this, so it's fine by me.

cool

also, the retry functionality has been removed, as it can be configured as an interceptor, or configured further down in the stack.

> 
> A few remarks on the code:
> 1. Unless I'm very much mistaken, the retry functionality won't work in async
> mode. This should be documented in ExtendedRpcCallback#replyFailed.

It should work in async mode, since it passes the EXRPC object into the channel as a feedback object wrapped in an error handler.


> 3. The call to ExtendedRpcCallback#replySucceeded in the synchronous case
> should be move outside the try/catch block. Perhaps put it in a try/catch of
> its own. The way it is now, it would cause spurious behaviour if it were to
> throw a RuntimeException.

yes, for sure, fixed in r1068989


thanks for your feedback
Filip