[azure.servicebus] Receiving messages with lock that has already expired #6178
Thanks @JosPolfliet! You can alter the number of "cached" messages by setting the "prefetch" value on "get_receiver":

```python
messages = queue_client.get_receiver(prefetch=1)
for message in messages:
    message.complete()
```
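A fuller, self-contained sketch of that suggestion against the 0.50.x API (the connection string and queue name are placeholders; `QueueClient.from_connection_string` and the context-manager usage are assumed from that SDK's samples):

```python
from azure.servicebus import QueueClient

# Placeholders - substitute your own connection string and queue name.
queue_client = QueueClient.from_connection_string(
    "<SERVICE_BUS_CONNECTION_STRING>", name="<QUEUE_NAME>")

# prefetch=1 asks the receiver to cache at most one message locally, so messages
# are not sitting in a client-side buffer while their locks tick down.
with queue_client.get_receiver(prefetch=1) as receiver:
    for message in receiver:
        # ... process the message ...
        message.complete()
```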
Thank you very much for your reply. From both the docs and the source code, it seems that … Per your suggestion, I tried with …
Hi, I am a colleague of @JosPolfliet working on this issue. I implemented `prefetch=1` and, as anticipated, this did not have any impact. An extract from the logs is below. As you can see, processing the first message takes less than 2 minutes. However, the following messages that are received have expired already. The time to live is set to 10675199 days, hence it cannot be due to reaching the time to live. Is there anything else we can check?

```
2019-07-09 09:11:52 INFO adp Downloading new messages from preprocessrequest
2019-07-09 09:11:52 DEBUG azure.servicebus.common.utils Running lock auto-renew thread for 3600 seconds
2019-07-09 09:11:52 DEBUG azure.servicebus.common.utils 30 seconds or less until lock expires - auto renewing.
2019-07-09 09:11:52 DEBUG adp Message locked until: 2019-07-09 09:12:52.290000
2019-07-09 09:12:22 DEBUG azure.servicebus.common.utils 30 seconds or less until lock expires - auto renewing.
2019-07-09 09:12:52 DEBUG azure.servicebus.common.utils 30 seconds or less until lock expires - auto renewing.
2019-07-09 09:13:22 DEBUG azure.servicebus.common.utils 30 seconds or less until lock expires - auto renewing.
2019-07-09 09:13:35 DEBUG adp Finished processing. Marking message as complete
2019-07-09 09:13:35 DEBUG adp Closing AutoLockRenew
2019-07-09 09:13:35 INFO adp Downloading new messages from preprocessrequest
2019-07-09 09:13:35 INFO adp Processing new message from preprocessrequest
2019-07-09 09:13:35 WARNING adp Received an expired message so skipping it. Was locked until: 2019-07-09 09:12:47.273000. Message contents: {"filePath": "19331272_3_aanbod_pdf.pdf", "fileId": "5d243f7df1d22433e26a474c", "uploadId": "5d243f7df1d22433e26a474c", "organisationId": "5d11ee7807a226ae21ec504f"}
2019-07-09 09:13:35 INFO adp Downloading new messages from preprocessrequest
```
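For reference, the auto-renew pattern visible in those logs looks roughly like the sketch below. This is a reconstruction for azure-servicebus 0.50.x, not the actual application code; the import path, the `register(message, timeout=...)` signature, and the placeholder names are assumptions, and the 3600-second timeout simply mirrors the log line above.

```python
from azure.servicebus import AutoLockRenew, QueueClient

queue_client = QueueClient.from_connection_string(
    "<SERVICE_BUS_CONNECTION_STRING>", name="preprocessrequest")

renewer = AutoLockRenew()
with queue_client.get_receiver(prefetch=1) as receiver:
    for message in receiver:
        # Keep renewing this message's lock in the background for up to an hour,
        # matching the "auto-renew thread for 3600 seconds" log entry.
        renewer.register(message, timeout=3600)
        # ... long-running processing ...
        message.complete()
renewer.shutdown()
```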
I am seeing something similar, but in my case the receiver times out and closes after 5 or more minutes (using version 0.50.1 under WSL 1 with Debian). The documentation says that, by default, the receiver stays open until closed, since the idle timeout is 0 by default, but that is not what we experience when using ReceiveAndDelete mode. You really need to provide some samples where a receiver gets one message from the queue, say, each hour, or runs over a long period past some of the timeouts that exist in your code. The samples I have seen never deal with this. There is also very little documentation on using ReceiveAndDelete mode from Python.
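For what it's worth, the long-lived ReceiveAndDelete receiver being described would look something like the sketch below under 0.50.x. The `ReceiveSettleMode` import and the `mode`/`idle_timeout` keywords are assumptions based on the documentation quoted above, not something verified in that exact environment.

```python
from azure.servicebus import QueueClient, ReceiveSettleMode

queue_client = QueueClient.from_connection_string(
    "<SERVICE_BUS_CONNECTION_STRING>", name="<QUEUE_NAME>")

# ReceiveAndDelete removes the message from the queue as soon as it is delivered,
# so there is no lock to renew, but also no way to abandon a message on failure.
# idle_timeout=0 is documented as "stay open until explicitly closed".
with queue_client.get_receiver(
        mode=ReceiveSettleMode.ReceiveAndDelete, idle_timeout=0) as receiver:
    for message in receiver:
        print(str(message))  # messages may arrive hours apart
```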
It's not exactly hard to recreate either:
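A hypothetical sketch of that kind of repro (placeholder connection string and queue name; the roughly 60-second lock duration is taken from the log timestamps earlier in the thread, and `locked_until` is the 0.50.x attribute name as best I recall):

```python
import time

from azure.servicebus import QueueClient

queue_client = QueueClient.from_connection_string(
    "<SERVICE_BUS_CONNECTION_STRING>", name="<QUEUE_NAME>")

with queue_client.get_receiver() as receiver:
    for i, message in enumerate(receiver):
        if i == 0:
            # Spend longer on the first message than the queue's lock duration
            # (about 60 seconds in the logs above).
            time.sleep(180)
        # Subsequent messages often show up with locked_until already in the
        # past, and complete() then fails because the lock has expired.
        print(message.locked_until)
        message.complete()
```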
Same issue here, and I didn't find any way to fix it.
Sorry for the delay in investigating this.
This sequence seems to show the correct order. @nemakam - are you able to provide any insight into why a message would arrive at the client with its 'locked-until' field already expired?
Hi @JosPolfliet, @nicolasgere @alvercau - I have identified the root issue behind this and am working on a fix for it now.
Great, look forward to it.
-- Bob
That's really good news. Thank you!
--
Aleksandra Vercauteren
I have a PR open with a potential fix here: Azure/azure-uamqp-python#96
And preview builds of uAMQP can be downloaded here: https://dev.azure.com/azure-sdk/public/_build/results?buildId=119291
The fix for this issue requires no changes to the Service Bus SDK, so it would be great if you could try out the updated uAMQP to see whether it resolves the issue for you.
Will do and will get back to you.
-- Bob
This should now be resolved in the updated uAMQP (v1.2.3), which has just been released. You should be able to install with:
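Presumably something along the lines of `pip install --upgrade "uamqp>=1.2.3"`; upgrading the installed uAMQP to 1.2.3 or later is what matters here, since the Service Bus SDK itself is unchanged.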
Hey folks, not having heard of any recurring issues with this, I'm closing this out. Don't hesitate to reopen if this problem is still occurring.
I'm still facing this issue. The tricky thing is that I can only reproduce it when receiving messages from a topic under high load.
And the result is:
I have tried to debug this and found that the `_received_messages` queue in `uamqp/client.py` fills up pretty quickly (it can contain hundreds of messages) compared to the pace of message handling, and this does not depend on the specified `prefetch`. On a topic with quite low load everything works as expected. One more interesting fact: `enqueued_time` is the same for these messages.
azure-servicebus==0.50.3
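As a stopgap while this is open, the skip-on-expired guard that several people in this thread describe looks roughly like this; `locked_until` is the 0.50.x attribute name as best I recall (a naive UTC datetime), so double-check it against your installed version.

```python
import datetime


def handle(message):
    # Skip messages whose PeekLock has already lapsed instead of failing
    # later on complete(); the message will be redelivered by the service.
    if message.locked_until <= datetime.datetime.utcnow():
        print("Received an expired message, skipping. Was locked until:",
              message.locked_until)
        return
    # ... real processing ...
    message.complete()
```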
We've been experiencing a similar case with our Service Bus lately. We have multiple producers and multiple consumers running simultaneously, and we have observed occasions where the same message is delivered to a second consumer, but then with an already expired lock (we check …).
As you can see there, the second consumer receives the same message as the first one (well after it has already been processed and completed) but immediately (~3 ms) fails due to the expired lock. We're only experiencing this when our system is under heavy CPU load, and we were not able to reproduce it artificially. We're using …
@KieranBrantnerMagee, I know this isn't much information, but is there a chance to reopen or reassess this? If you need additional specific info or have an idea on how we might be able to reproduce this, please let me know!
In our application logs, we see that we sometimes receive messages whose lock has already expired.
This happens both when calling the native `next()` function and when using a loop that calls `fetch_batch` while explicitly disabling prefetching by setting `max_batch_size=1`.
Usually we get something like this:
but sometimes we get expired messages:
As far as I can tell, this never happens with the first message, but if the first message takes a long time to process (e.g. longer than the original lock duration, had the later message been locked at the same time as the first one), it does sometimes happen with subsequent messages.
Relevant code:
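The original snippet is not reproduced here; the following is a hypothetical reconstruction of the pattern described above for azure-servicebus 0.50.x. The batch call in that SDK is `fetch_next`, which I'm assuming corresponds to the `fetch_batch` mentioned above, and the connection details are placeholders.

```python
from azure.servicebus import QueueClient

queue_client = QueueClient.from_connection_string(
    "<SERVICE_BUS_CONNECTION_STRING>", name="<QUEUE_NAME>")

with queue_client.get_receiver() as receiver:
    while True:
        # One message at a time, with prefetching effectively disabled.
        batch = receiver.fetch_next(max_batch_size=1, timeout=60)
        if not batch:
            continue
        message = batch[0]
        # ... long-running processing; later messages sometimes arrive
        # with their lock already expired ...
        message.complete()
```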
What could be the problem here? Are we doing something wrong?
Relevant `pip freeze` output: