Blocking calls not working as expected in the case of disconnections #610
Comments
Hi @manast. ioredis keeps reconnecting to the server forever, so all commands block while it is disconnected. This behavior makes sense for an application whose connection will recover shortly (<10s~1min). Setting case 1: prints errors immediately.
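For reference, a minimal sketch of how the "reconnect forever" behaviour can be bounded. It assumes ioredis' documented `retryStrategy` option, which receives the attempt count and returns the delay in milliseconds before the next attempt, or a non-number to stop reconnecting. The function name and both limits are hypothetical, and this is not necessarily the setting referred to in the comment above.

```javascript
// Hypothetical limits, chosen for illustration only.
const MAX_RETRIES = 10;
const MAX_DELAY_MS = 2000;

// Matches the shape of ioredis' `retryStrategy` option: called with the
// number of reconnection attempts so far; returns the delay (ms) before the
// next attempt, or a non-number to tell ioredis to stop reconnecting.
function boundedRetryStrategy(times) {
  if (times > MAX_RETRIES) {
    return null; // give up instead of retrying forever
  }
  return Math.min(times * 200, MAX_DELAY_MS);
}

// Assumed usage (requires the ioredis package and a reachable server):
// const Redis = require("ioredis");
// const redis = new Redis({ retryStrategy: boundedRetryStrategy });
```

Once the strategy gives up, the client stays disconnected until `connect()` is called manually, so this trades hanging commands for an explicit failure and may not suit every deployment.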
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 7 days if no further activity occurs, but feel free to re-open a closed issue if needed. |
bump |
@luin It's quite strange: I use a Docker image with Redis instead of a plain local Redis server. I tested launching a local redis-server and the bug doesn't occur in that case (external IP, protected-mode off). My colleagues and I tested reconnection in a production-like environment and couldn't reproduce it. We use Kubernetes; maybe it somehow affects the bug. I'm not sure that I can investigate the issue further. I tried different Redis options in Docker and locally with the same result: it happens in the Docker container, but not with a local Redis server.
OK, I'll try to test again with a reproducible environment.
We're definitely seeing this issue occur when using an Azure Redis instance. If I scale the service, Azure will disconnect any clients when it cuts over. |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 7 days if no further activity occurs, but feel free to re-open a closed issue if needed. |
bump to avoid auto close |
@manast have you been able to come up with a good workaround for this issue? I'm using bull for an internal app I'm building at work. Everything works as expected on my dev box, but I run into this issue when I configure my app to use Redis instances on different hosts.
I use Kubernetes and have seen this issue. I think that to re-create it, you need to establish a healthy connection to Redis, then kill your Redis server, send a command to it (causing an exception), and then start your Redis server back up. Non-blocking calls will connect successfully; blocking calls will throw an exception. I can also confirm that this behavior occurs even if you're using Sentinels: if you shut down all of your sentinels, the behavior is exactly the same.
Is this still an active issue? It may explain problems we've been seeing (also in kubernetes). |
I was able to resolve this but it required a TON of tweaking of my redis instances in Kubernetes, and code hacks. So it's solvable with a lot of work. FWIW, I ended up dumping my custom redis configuration and went with the Bitnami Helm chart. I made sure to set Doing those things appears to have fixed this issue entirely, I have not seen it happen in over 6 months. I also changed the redis config options.
In addition, based on my testing I noticed even with those changes it still appears to happen if the redis instance doesn't have enough memory or cpu resources. So I doubled those as well. |
We are having a serious issue in bull (OptimalBits/bull#890), where the queue stops processing commands in the event of disconnections. I have tracked it down to be an issue in ioredis. It seems that blocking commands are not handled properly in the case of disconnections. It is very easy to reproduce, but there are many cases to consider. Here I report the most obvious ones.
Code to reproduce:
Case 1. Disconnect before calling blocking command.
Behaviour
Dangling call; nothing ever happens.
Expected
Error or at least timeout after given timeout.
Case 2. Connected before calling command, disconnected afterwards.
Behaviour
Dangling call; nothing ever happens.
Expected
Error or at least timeout after given timeout.
Case 3. Connected before calling blocking command, disconnected and then reconnected.
Behaviour
Dangling call; nothing ever happens.
Expected
Error or at least timeout after given timeout.
Case 4. Disconnected before calling blocking command, connected afterwards.
Behaviour
Times out 10 seconds after reconnection.
Expected
Works as expected?
Since the blocking command is not cancelable (#516), there is currently no workaround I know of for this, and you may end up with a dangling client. I think this issue is quite serious, but please let's discuss it.
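To make the "Expected" behaviour in cases 1 to 3 concrete for discussion, here is a minimal caller-side sketch. It assumes the blocking call returns a promise and races it against a timer. `withTimeout` and its parameters are hypothetical names, not part of ioredis. It only stops the caller from hanging: it does not cancel the underlying command or release the dangling client, so it is not a fix for the problem described above.

```javascript
// Hypothetical helper: rejects if `promise` has not settled within `ms`.
// It does NOT cancel the underlying Redis command or free the client.
function withTimeout(promise, ms, label = "operation") {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms
    );
  });
  // Whichever settles first wins; always clear the timer afterwards so it
  // does not keep the process alive.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Assumed usage with an ioredis client named `redis` (key names are made up):
// const BLOCK_SECONDS = 10;
// withTimeout(
//   redis.brpoplpush("source", "dest", BLOCK_SECONDS),
//   (BLOCK_SECONDS + 5) * 1000,
//   "brpoplpush"
// ).catch((err) => console.error(err.message));
```

The guard timeout is set a little longer than the command's own blocking timeout so that a healthy connection still gets the chance to answer first.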