Blocking not working properly in case of reconnections #1285
Hey @manast, thanks for raising this. Actually, the code works as expected on my side:
The expected result is that when the server is down, ioredis should try to reconnect (according to the retryStrategy). Did I miss anything?
OK, this is interesting. After reading your response I tried with a Redis server running natively on my Mac instead of using a Docker container as before, and now I get the same result as you did. How is that possible?
Could you provide some instructions for a reproducible setup? For example, the CLI command used to run the Docker container.
No, I did not get the MaxRetriesPerRequestError either.
I also experienced the same issue with bullmq causing memory leaks and high CPU load, and the while loop was the reason in some cases. There might really be a bug in ioredis in the reconnecting state, but I haven't had the issue anymore since I started checking the states manually (I also migrated to Redis streams in the meantime):

```js
function waitForReconnect (client) {
  let reconnectListener
  const promise = new Promise((resolve) => {
    reconnectListener = function () {
      // remove both listeners so the promise is only resolved once
      client.removeListener('ready', reconnectListener)
      client.removeListener('end', reconnectListener)
      resolve()
    }
    client.once('ready', reconnectListener)
    client.once('end', reconnectListener)
  })
  // expose the listener so it can be called manually during shutdown
  promise.resolve = reconnectListener
  return promise
}
```

And adopt it in the loop:

```js
// call reconnectPromise.resolve() during server/queue shutdown
let reconnectPromise = undefined
while (true) {
  if (client.status === 'end') return
  try {
    console.log('going to block');
    const value = await client.brpoplpush('a', 'b', 4);
    console.log('unblocked', value);
  } catch (err) {
    // if redis got closed (by a stopping server), we can just return, as the queue should be shut down
    if (client.status === 'end') return
    console.error('ERROR', err);
    switch (client.status) {
      case 'wait':
      case 'connecting':
      case 'reconnecting':
        // wait until the client is ready again (or the connection has ended) before blocking again
        reconnectPromise = waitForReconnect(client)
        await reconnectPromise
        reconnectPromise = undefined
    }
  }
}
```

Ah, and with that code, I'd configure a low maxRetry config, as you don't need to rely on ioredis to handle the errors.
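For reference, a hypothetical sketch of what such a configuration might look like with ioredis; the option values below are assumptions for illustration, not the commenter's actual settings:

```js
const Redis = require('ioredis');

const client = new Redis({
  // fail queued commands quickly instead of retrying for a long time,
  // since the loop above handles waiting for the reconnection itself
  maxRetriesPerRequest: 1,
  // assumed backoff values; any modest retryStrategy would do here
  retryStrategy: (times) => Math.min(times * 200, 2000),
});
```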
Anyone who can reproduce this issue, can you enable the debug log (`DEBUG=ioredis:*`) and share the output here?
@luin As mentioned by @manast, it's happening for me as well with a Redis instance inside Docker on Mac.
We are facing an issue of jobs getting stuck on a few production instances too (non-Docker). I am guessing it could be because of the same bug, but I am not sure. Any thoughts?
@luin Attaching logs captured with the debug log enabled.
I managed to reproduce this issue: it occurs when retryStrategy returns values less than 50 (exp(0), exp(1), exp(2) in the code snippet below, the BullMQ default). I cannot reproduce it with the ioredis default retryStrategy.
Output:
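For illustration (the original snippet from this comment is not included in this excerpt), a sketch of a retryStrategy that returns such small exponential delays, compared with the ioredis default:

```js
const Redis = require('ioredis');

// Assumed shape of the retryStrategy described above: Math.exp(0), Math.exp(1),
// Math.exp(2), ... milliseconds, i.e. well below 50 ms for the first attempts.
const client = new Redis({
  retryStrategy: (times) => Math.min(Math.exp(times - 1), 20000),
});

// The ioredis default, by contrast, never returns less than 50 ms:
//   retryStrategy: (times) => Math.min(times * 50, 2000)
```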
There is an older issue, #610, also created by me, but since this is rather important I would like to create a new one with very specific code to reproduce the issue easily, so that it can be resolved once and for all :).
This is the code:
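The exact snippet is not included in this excerpt; a minimal sketch of the kind of reproduction described (a blocking BRPOPLPUSH in an endless loop against a local Redis server) would look roughly like this:

```js
const Redis = require('ioredis');

const client = new Redis(); // assumes Redis on localhost:6379

async function run () {
  while (true) {
    console.log('going to block');
    // blocks for up to 4 seconds, then resolves with the popped value or null
    const value = await client.brpoplpush('a', 'b', 4);
    console.log('unblocked', value);
  }
}

run().catch((err) => console.error('ERROR', err));
```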
How to reproduce
Just run the code above with a local Redis server. The while loop will run forever, outputting the following:
It just blocks for up to 4 seconds, then unblocks, and so on.
Now, while this program is running, stop the Redis server, wait a couple of seconds, and start it again:
Output would look like this:
And that's all. The while loop will not continue running, as if the call to `client.brpoplpush` had hung forever.

Expected results

I expect that as soon as the client disconnects, the call to `client.brpoplpush` rejects the promise with a connection error. Client code should then be able to call this blocking command again.

I am a bit surprised that no one else has reported this issue. I wonder if there is some wrong expectation on my side or if I am using the library incorrectly; if so, please let me know, since this issue is quite severe for users of the Bull/BullMQ libraries.
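To restate the expectation in code, a sketch of the desired behaviour (not of how ioredis currently behaves): the blocking call rejects on disconnect and the loop simply issues it again.

```js
const Redis = require('ioredis');
const client = new Redis();

async function consume () {
  while (true) {
    try {
      const value = await client.brpoplpush('a', 'b', 4);
      console.log('unblocked', value);
    } catch (err) {
      // expected: a connection error is thrown here when the server goes away,
      // and the next iteration issues the blocking command again after reconnect
      console.error('connection error, retrying', err);
    }
  }
}

consume();
```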