[BUG] [Microsoft.Azure.ServiceBus] Closing MessageReceiver Does not Always Close inner ReceivingAmqpLink #16994
Comments
Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @axisc. |
Thank you for your feedback. Tagging and routing to the team best able to assist. |
//fyi: @JoshLove-msft |
I highly suspect this issue is due to the Microsoft.Azure.Amqp [...]
In my linked repo you can see the (failing) test here: https://github.com/paulsavides/ServiceBusTesting/blob/4cf80ba2f8e2d725b2d1923600d4aba3f26ee9b3/InterestingTests/SingletonTests.cs#L12-L38
Although the singleton is closed & subsequent [...]
In the usage here, I believe the following two sections are being run in an order that causes the issue I've replicated in that unit test. During periods of transient errors, this section may enter the workflow to recreate the message receiver: azure-sdk-for-net/sdk/servicebus/Microsoft.Azure.ServiceBus/src/Core/MessageReceiver.cs, lines 1078 to 1082 in d2bcf77.
If receiver.CloseAsync() is called after entering the callback here (azure-sdk-for-net/sdk/servicebus/Microsoft.Azure.ServiceBus/src/Core/MessageReceiver.cs, line 1557 in d2bcf77), it seems like it would be possible to end up with an orphaned receiver. |
/cc @xinchen10 to confirm the issue in the AMQP library. |
Hello, after upgrading my reproduction projects to Microsoft.Azure.Amqp 2.4.9 I could no longer reproduce the issue. See https://github.com/Azure/azure-amqp/releases/tag/v2.4.9 Upgrading that library should be enough to close this issue I believe. |
@paulsavides Thanks for confirming this. I'm closing the issue for now. |
@axisc as is, the Microsoft.Azure.ServiceBus package should still exhibit the issue. Updating the dependent version of Microsoft.Azure.Amqp to 2.4.9 in the azure-sdk would make sure it was fixed in here as well. |
Updating the repo dependency to 2.4.9 in #17290. Whenever the Service Bus library is next released it will contain the updated dependency. |
Thank you Josh! |
Thank you for finding and fixing this issue! |
@paulsavides we have released a new NuGet package, version 5.1.1, with the updated dependency: https://www.nuget.org/packages/Microsoft.Azure.ServiceBus/. Can you see if this issue goes away? |
Hello @DorothySun216, I have updated my reproduction repo listed above to version 5.1.1 and was no longer able to reproduce the issue. Additionally, we have been directly using Microsoft.Azure.ServiceBus v2.4.9 in our services for around two months now and have not seen the issue reproduce. Thank you, have a wonderful day! |
Describe the bug
During periods when a large number of server side errors occur, we sometimes see messages getting "stuck" in queues. As in, they hang in the queues for the configured message lock timeout before being redelivered.
After some attempts to reproduce with a smaller example, I have found that, in certain scenarios, when calling MessageReceiver.CloseAsync(), the inner ReceivingAmqpLink is not actually closed. So the link sits there in the background, continually picking up messages.
The only way I was able to reproduce this is by closing the receiver & opening a new one when an error is received on the ExceptionHandler. My best guess as to why this issue occurs: when the inner link faults, it will auto-recover in OnReceiveAsync(). Perhaps there is a race condition when auto-recovery & closing the receiver happen at similar times.
Of course, perhaps there is something completely off with the usage of the sdk here as well.
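To make the suspected race concrete, here is a minimal, language-neutral sketch in Python. It is not the SDK's code: `Link`, `Receiver`, `recover`, `orphaned_links`, and the `guard_recovery` flag are all hypothetical names invented for illustration. It only models the ordering described above, where a close request lands while a replacement link is still being created, assuming recovery swaps the link in without checking whether the receiver was closed in the meantime.

```python
import asyncio


class Link:
    """Hypothetical stand-in for a receiving link; not the SDK type."""

    def __init__(self, name):
        self.name = name
        self.is_open = True

    def close(self):
        self.is_open = False


class Receiver:
    """Toy receiver that recreates its link after a fault."""

    def __init__(self, guard_recovery):
        self.guard_recovery = guard_recovery
        self.closed = False
        self.link = Link("link-1")
        self.all_links = [self.link]

    async def recover(self, link_created, may_finish):
        # Creating a link is asynchronous, so close() can run while we wait.
        new_link = Link("link-2")
        self.all_links.append(new_link)
        link_created.set()
        await may_finish.wait()
        if self.guard_recovery and self.closed:
            # Close was requested mid-recovery: do not leave the new link open.
            new_link.close()
            return
        self.link = new_link

    def close(self):
        self.closed = True
        self.link.close()  # closes only the link known at this moment


async def _run(guard_recovery):
    receiver = Receiver(guard_recovery)
    link_created, may_finish = asyncio.Event(), asyncio.Event()
    recovery = asyncio.create_task(receiver.recover(link_created, may_finish))
    await link_created.wait()  # recovery is now in flight
    receiver.close()           # close races with recovery
    may_finish.set()
    await recovery
    return [link.name for link in receiver.all_links if link.is_open]


def orphaned_links(guard_recovery):
    """Return the names of links still open after the receiver was closed."""
    return asyncio.run(_run(guard_recovery))
```

In this model, `orphaned_links(False)` leaves the replacement link open after close, which matches the "stuck" behavior seen here, while `orphaned_links(True)` leaves nothing open. Whether the real library fails in exactly this way is an assumption on my part.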
Expected behavior
In general, I would expect that CloseAsync() would always close the inner ReceivingAmqpLink.
To Reproduce
Reproduction Repo = https://github.com/paulsavides/ServiceBusTesting
ReproProject is the project that reproduces this issue. If the code is doing something extremely incorrect, please let me know. We are actually using the MassTransit library to interact with AzureServiceBus, so I had to recreate a bit of what it was doing that reproduces the error.
[...] Auto-delete after idle setting.
[...] d to print out diagnostics on all of the links from "closed" receivers that are still open & the number of unsettled messages from those links.
Environment:
Please let me know if you require any clarification from me.
Thank you for taking the time to look into this,
Paul Savides