Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Event Hubs] Client throws error when closing due to failure in closing the management link #9615

Closed
ramya-rao-a opened this issue Jun 18, 2020 · 4 comments · Fixed by #9998
Assignees
Labels
Client This issue points to a problem in the data-plane of the library. Event Hubs
Milestone

Comments

@ramya-rao-a
Copy link
Contributor

Run the below code a few times.

const { EventHubConsumerClient } = require("@azure/event-hubs");

async function main() {
  const client = new EventHubConsumerClient("$Default", connectionString, "my-hub");
  const partitionsIds = await client.getPartitionIds();
  console.log(partitionsIds);

  let subscription;
  await new Promise((resolve, reject) => {
    subscription = client.subscribe(
      "0",
      {
        processEvents: async () => {
          resolve();
        },
        processError: async (err) => {
          reject(err);
        },
      },
      {
        maxWaitTimeInSeconds: 0,
      }
    );
  });
  await subscription.close();

  await client.close().catch((err) => {
    console.log("error from closing the client", err);
  });
  console.log("client closed");
}

main();

Most of the time it would succeed, but once in a while you will see the below error from client.close()

An error occurred while closing the management session: undefined
    at ManagementClient.<anonymous> (/Users/ramyar/Documents/GitRepos/test/node_modules/@azure/event-hubs/dist/index.js:431:23)
    at Generator.throw (<anonymous>)
    at rejected (/Users/ramyar/Documents/GitRepos/test/node_modules/tslib/tslib.js:113:69)
    at process._tickCallback (internal/process/next_tick.js:68:7)

This does not happen if you remove the code for client.getPartitionIds() or change the maxWaitTimeInSeconds to a value greater than 0

On digging deeper, I found the below:

On console logging all other places in the context that could have the right error, I found that context.sender.error indeed has the error which is

c {
  value:
   [ Typed { type: [TypeDesc], value: 'amqp:not-found' },
     Typed {
       type: [TypeDesc],
       value:
        'The session channel \'0\' cannot be found in connection \'in-connection2462317(ramyar-eh-test)\'.' },
     Typed { type: [TypeDesc], value: null } ] }

This error seem to be coming from the service and is not something that the client or any of its dependencies throw. I might be wrong.

I don't see this when changing the maxWaitTimeInSeconds to a value greater than 0, so it tells me that this must be some kind of race condition, but I fail to see how this can affect the management link

@ghost ghost added the needs-triage Workflow: This is a new issue that needs to be triaged to the appropriate team. label Jun 18, 2020
@ramya-rao-a ramya-rao-a added Client This issue points to a problem in the data-plane of the library. Event Hubs labels Jun 18, 2020
@ghost ghost removed the needs-triage Workflow: This is a new issue that needs to be triaged to the appropriate team. label Jun 18, 2020
@ramya-rao-a ramya-rao-a added this to the Backlog milestone Jun 18, 2020
@ramya-rao-a
Copy link
Contributor Author

Logged amqp/rhea-promise#54 for follow up in rhea-promise

cc @chradek, @richardpark-msft, @HarshaNalluru

@chradek
Copy link
Contributor

chradek commented Jul 8, 2020

Early update:

Still investigating, but did find something else that's interesting. When I run the sample code above and comment out the 1st client.getPartitionIds() call, the process isn't closing. I can see there are 2 timers (a 2 minute and 1 minute timer, both in rhea) and a socket that's open.

@chradek
Copy link
Contributor

chradek commented Jul 8, 2020

I haven't dug too deep into why the above error is happening, but I did notice something odd with the RequestResponseLink.close() implementation:

async close(): Promise<void> {
await this.sender.close();
await this.receiver.close();
await this.session.close();
}

Note that we call close on the sender and receiver before closing the session.

rhea-promise accepts an option closeSession and if it is not set will attempt to close the session after the sender is closed. Since we're closing the session ourselves, we should be passing a value of false for closeSession. When I did this, the above error went away.

Now, I'm not yet sure why this causes an error in sender.close since the session should be closed only after the sender is done closing. I would have expected sender.close() to work and possibly receiver.close() to fail instead. I'll need to investigate a little more why the error is being thrown in sender.close so I can be more sure if this change actually fixes the underlying issue or just improves it.

@ramya-rao-a
Copy link
Contributor Author

rhea-promise accepts an option closeSession and if it is not set will attempt to close the session after the sender is closed

This was done in amqp/rhea-promise#44
But before adding this option, the default behavior was indeed to close the session as well.
This PR provided a way to opt out of it

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Client This issue points to a problem in the data-plane of the library. Event Hubs
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants