ECONNREFUSED Error when adding Job to Queue #83
Is your app running in a Docker container? Right now I'm having an issue, but it only refuses the connection when the API is running in its own container. I think it's similar to this ioredis issue, which has a few potential solutions.
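For anyone hitting this in Docker: a common cause is the client defaulting to 127.0.0.1, which inside a container refers to the container itself rather than Redis. A minimal sketch of pointing the connection at the Redis service instead; the `redis` hostname and env var names are assumptions, not from this thread:

```ts
import { Queue } from 'bullmq';
import Redis from 'ioredis';

// Inside a container, 127.0.0.1 is the container itself, not Redis --
// connect to the Compose/Kubernetes service hostname instead.
const connection = new Redis({
  host: process.env.REDIS_HOST ?? 'redis', // assumed service name
  port: Number(process.env.REDIS_PORT ?? 6379),
});

const queue = new Queue('jobs', { connection });
```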
Yeah, my app is running in a Kubernetes cluster.
I also have mine running in Kubernetes, with Redis set up with Redis Sentinel and 3 instances. It works great for a while. I used to get this exact error before setting up Sentinel, so I figured I would set up Sentinel to make it more redundant. Now it's all working, but there is some truly strange weirdness that I can't figure out.
What I don't know is whether this behavior is specific to ioredis or some odd way BullMQ is using it? I honestly don't know.
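For reference, this is roughly how a Sentinel connection can be handed to BullMQ; a minimal sketch, assuming a master group named `mymaster` and the Sentinel addresses shown (both are assumptions, not this poster's config):

```ts
import { Queue } from 'bullmq';
import Redis from 'ioredis';

// ioredis asks the Sentinels for the current master, so a failover
// does not require changing this configuration.
const connection = new Redis({
  sentinels: [
    { host: 'sentinel-0', port: 26379 }, // assumed Sentinel endpoints
    { host: 'sentinel-1', port: 26379 },
    { host: 'sentinel-2', port: 26379 },
  ],
  name: 'mymaster', // assumed master group name
});

const queue = new Queue('jobs', { connection });
```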
@elucidsoft my gut feeling is that this is an ioredis issue; it would be great if you could verify it with a simple redis example. I already posted an issue regarding connections in the ioredis repo:
Is there a call other than queue.add that I can use for my health checks without triggering this? I would love a method called isHealthy(); it would make doing health checks a breeze with Bull. I think I'll put in a request for this.
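There is no built-in isHealthy() in BullMQ that I know of, but a rough equivalent can be built on a plain PING against the queue's underlying Redis client; a minimal sketch (the `isHealthy` name and the timeout are assumptions, not part of the BullMQ API):

```ts
import { Queue } from 'bullmq';

// Hypothetical helper: PING the queue's Redis connection instead of
// enqueueing a real job just to probe liveness.
async function isHealthy(queue: Queue, timeoutMs = 1000): Promise<boolean> {
  try {
    const client = await queue.client; // underlying ioredis instance
    const pong = await Promise.race([
      client.ping(),
      new Promise<never>((_, reject) =>
        setTimeout(() => reject(new Error('ping timed out')), timeoutMs),
      ),
    ]);
    return pong === 'PONG';
  } catch {
    return false;
  }
}
```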
This is a non-issue. After months of testing, troubleshooting, and refining, here is the deal: if Redis is terminated forcefully, for instance if you're on a preemptible node or a spot instance and your Redis instance is given zero notice for a graceful shutdown, you can end up in this situation. But it has nothing to do with BullMQ. It's an odd issue with connection state in Redis: when Redis comes back up, it seems to perform some actions to restore its state, and during that time ioredis will be connecting to it. Connections established before Redis is ready end up only able to perform a very limited set of actions, and ioredis is not capable of recovering from this; the only way out is a full restart of the application, because some internal state in ioredis prevents recovery.
In any case, I have not had this happen to me in several months now, so I am 100% stable. If the clients get killed it does not affect anything; you will see this error occasionally when that happens, but in every instance I've seen a 100% full recovery in a matter of seconds. Moral of the story: don't put StatefulSets on volatile node types. In fact, the Kubernetes documentation mentions that this is unsupported.
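That "limited set of actions" window sounds like Redis's LOADING phase, while it restores the dataset after a hard restart. One mitigation worth trying is ioredis's reconnectOnError hook, which can force a fresh connection when that error surfaces; a minimal sketch, assuming the failing commands actually return a LOADING error (that mapping is an assumption, not confirmed in this thread):

```ts
import Redis from 'ioredis';

const connection = new Redis({
  // Reconnect, and resend the failed command (return value 2), when Redis
  // answers with a LOADING error, i.e. it accepted the connection before
  // it had finished restoring its dataset.
  reconnectOnError(err) {
    return err.message.startsWith('LOADING') ? 2 : false;
  },
  // Keep retrying with a capped backoff instead of giving up.
  retryStrategy(times) {
    return Math.min(times * 200, 5000);
  },
});
```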
I created an IORedis instance and initiated a queue; it then fails with the error below when I try to add a job to the queue.
While debugging I thought it was my Redis connection, so I added the code below before calling Queue.add, and it returns the keys successfully, while Queue.add still fails. I have tried downgrading to 1.4.3 too.
Any help would be greatly appreciated!
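The original snippets are not included above, but the described flow is roughly the following; a minimal sketch reconstructing it (the queue name, job payload, and connection details are assumptions, not the reporter's actual code):

```ts
import { Queue } from 'bullmq';
import Redis from 'ioredis';

async function main() {
  const connection = new Redis({ host: '127.0.0.1', port: 6379 }); // assumed address

  // Sanity check used while debugging: this succeeds...
  const keys = await connection.keys('*');
  console.log(`redis reachable, ${keys.length} keys`);

  // ...yet adding a job to the queue fails with ECONNREFUSED.
  const queue = new Queue('my-queue', { connection });
  await queue.add('my-job', { some: 'data' });
}

main().catch(console.error);
```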