RequestAsync doesn't resume after broker restart (2.x) #272

cocowalla · 2017-09-24T20:56:27Z

Sorry, another issue relating to the broker going down!

Using RawRabbit 2.x with request/response, I have an IBusClient that calls RequestAsync. When the broker goes down after at least one request has already been sent, I see these messages logged from RawRabbit:

The existing connection is not open.
Connection is recoverable. Waiting for 'Recovery' event to be triggered.

When the broker comes back up, I see:

Connection has been recovered!

...which sounds very promising, but actually the call to RequestAsync appears hung, as it never returns.


pardahlman · 2017-09-25T06:07:45Z

Hello, hi! No worries - the issues you raise are relevant for the client, so don't stop 👍

Is your scenario that the broker goes down after a request is sent, before the response is received? And does this occur when an RPC call of the same message type has already been successfully performed?

cocowalla · 2017-09-25T10:51:22Z

It's after an RPC call of the same message type has been successfully performed (both sent and received)

pardahlman · 2017-09-29T19:18:14Z

I took a look at this, and it turns out that the issue was mitigated in the consumer factory in a recent commit (7b9a12242).

I tried to reproduce the issue you described by:

1. Completing an RPC
2. Restarting the broker
3. Performing an RPC with the same messages as in 1

Doing this with the current code in branch 2.0 executed as expected.

cocowalla · 2017-09-29T19:52:20Z

I still have the same issue. Slightly different steps to reproduce:

1. Complete an RPC
2. Take down the broker
3. Start an RPC
4. Bring the broker back up

The call to RequestAsync from step 3 never completes.

This fix was originally intended for RPC calls that do not return a response if the broker goes down before the response is sent. However, when the broker goes down and then recovers, we need to ensure two things: 1. wait until the recovery has taken place; 2. sanity-check whether the channel has been reset, and if so, do not ack.

pardahlman · 2017-09-29T20:58:36Z

I've got some good news and some bad news 😉

The good news is that I've identified the problem. It was also related to the publish sequence that was being reset when the broker was restarted. RawRabbit didn't check to see if the delivery tag made sense for the channel, and could try to ack on a delivery tag that the broker did not recognize. With a fix for this in place I was able to restart the broker mid-RPC and get the response.
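To illustrate the idea of not acking a delivery the broker no longer knows about: in AMQP, delivery tags are scoped to a channel, so a tag received before a restart means nothing on the recovered channel. The sketch below is not RawRabbit's actual fix (which is not shown in this thread). It uses a simpler, swapped-in technique, stamping each delivery with a "generation" counter that is bumped on recovery. `ChannelGeneration` and `Delivery` are hypothetical names invented for this example.

```csharp
using System.Threading;

// Hypothetical helper, not part of RawRabbit: counts how many times the
// channel has been recovered so that stale delivery tags can be recognised.
public sealed class ChannelGeneration
{
    private int _current;

    public int Current => Volatile.Read(ref _current);

    // Intended to be called from whatever recovery callback the client exposes.
    public void MarkRecovered() => Interlocked.Increment(ref _current);

    // A delivery should only be acked on the channel incarnation that produced it.
    public bool CanAck(Delivery delivery) => delivery.Generation == Current;
}

// Hypothetical value type pairing a delivery tag with the generation it arrived in.
public struct Delivery
{
    public Delivery(ulong tag, int generation)
    {
        Tag = tag;
        Generation = generation;
    }

    public ulong Tag { get; }
    public int Generation { get; }
}
```

A consumer would create a `Delivery` with `generation.Current` when a message arrives and skip the ack when `CanAck` returns false.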

The bad news is that it does not seem to work with direct RPC (which is what RawRabbit uses by default). I don't know if the "pseudo queue" used for the response is configured with auto-delete (and thus removed when the consumer disconnects).

Here's an example of how the request was configured when I got it to work:

await requester.RequestAsync<BasicRequest, BasicResponse>(new BasicRequest(), ctx => ctx
  .UseRequestConfiguration(cfg => cfg
    .ConsumeResponse(r => r
      .Consume(c => c
        .WithRoutingKey("response_key"))
      .FromDeclaredQueue(q => q
        .WithName("response_queue"))
      .OnDeclaredExchange(e => e
        .WithName("response_exchange")
      )
    )
  ), ct: cs.Token);

Two things to note:

1. It might be a good idea to provide a cancellation token if you expect broker disconnects, as recovery takes some time and a cancellation token overrides the default request timeout.
2. The response queue needs to have AutoDelete set to false, otherwise it may be removed before the response is received. In a real-life scenario, I would probably use a GUID to create a unique routing key in order to guarantee that the response is routed to the correct application (in case of multiple instances, multiple threads, etc.).
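As a small illustration of the GUID suggestion, here is one way such a routing key could be generated. `ResponseRouting` and `NewRoutingKey` are hypothetical names for this sketch and are not part of RawRabbit.

```csharp
using System;

public static class ResponseRouting
{
    // Builds a routing key that is unique for each call, for example
    // "response_key.3f2a..." with a 32-character hex GUID suffix.
    public static string NewRoutingKey(string prefix) =>
        prefix + "." + Guid.NewGuid().ToString("N");
}
```

The value would typically be created once per application instance and passed to `WithRoutingKey` in place of the fixed `"response_key"` string.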

cocowalla · 2017-09-29T21:24:56Z

Thanks for looking into this, I'll give it a try tomorrow!

Regarding a cancellation token, yes, I think what I'll do is use one to time out requests after a while, then retry. That way, even if the request hangs while the broker is down, it will time out eventually regardless.
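The timeout-then-retry approach could look roughly like the following. This is a minimal sketch, not code from this thread: `RetryingRequest` and `ExecuteAsync` are hypothetical names, and it assumes the wrapped call surfaces cancellation as an `OperationCanceledException` (which includes `TaskCanceledException`).

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class RetryingRequest
{
    // Runs `attempt` up to `maxAttempts` times. Each attempt gets its own
    // CancellationToken that fires after `perAttemptTimeout`, so a call that
    // hangs (for example while the broker is down) is abandoned and retried.
    public static async Task<T> ExecuteAsync<T>(
        Func<CancellationToken, Task<T>> attempt,
        TimeSpan perAttemptTimeout,
        int maxAttempts)
    {
        if (attempt == null) throw new ArgumentNullException(nameof(attempt));
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        for (var i = 1; ; i++)
        {
            using (var cts = new CancellationTokenSource(perAttemptTimeout))
            {
                try
                {
                    return await attempt(cts.Token).ConfigureAwait(false);
                }
                catch (OperationCanceledException) when (i < maxAttempts)
                {
                    // This attempt timed out; loop around and try again.
                    // On the final attempt the exception propagates to the caller.
                }
            }
        }
    }
}
```

The `attempt` delegate would wrap the `RequestAsync` call and forward the supplied token as its `ct` argument, in the same way `cs.Token` is passed in the configuration above.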

cocowalla · 2017-09-30T21:03:35Z

Excellent, the config you provided works as described!

cocowalla closed this as completed Sep 30, 2017
