Intermittent Error while retrieving data from rabbitHost statusCode=0 #120

ufou · 2019-04-10T14:11:58Z

Hi,

We are using the rabbitmq-ha Helm chart, which also deploys the exporter to our cluster. Metrics work only intermittently: our dashboard frequently flips between OK and not OK, and the rabbitmq exporter logs show errors like these:

time="2019-04-10T14:03:50Z" level=info msg="Metrics updated" duration=36.638611ms
time="2019-04-10T14:04:00Z" level=info msg="Metrics updated" duration=114.957246ms
time="2019-04-10T14:04:10Z" level=error msg="Error while retrieving data from rabbitHost" error="Get http://localhost:15672/api/exchanges?sort=: dial tcp [::1]:15672: connect: cannot assign requested address" host="http://localhost:15672" statusCode=0
time="2019-04-10T14:04:10Z" level=error msg="Error while retrieving data from rabbitHost" error="Get http://localhost:15672/api/nodes?sort=: dial tcp [::1]:15672: connect: cannot assign requested address" host="http://localhost:15672" statusCode=0
time="2019-04-10T14:04:10Z" level=info msg="Metrics updated" duration=26.835083ms
time="2019-04-10T14:04:20Z" level=info msg="Metrics updated" duration=53.588805ms
time="2019-04-10T14:04:30Z" level=error msg="Error while retrieving data from rabbitHost" error="Get http://localhost:15672/api/exchanges?sort=: dial tcp [::1]:15672: connect: cannot assign requested address" host="http://localhost:15672" statusCode=0
time="2019-04-10T14:04:30Z" level=info msg="Metrics updated" duration=92.947296ms
time="2019-04-10T14:04:40Z" level=info msg="Metrics updated" duration=38.490507ms

We are running 3 replicas, so each pod has 2 containers, one rabbitmq-ha and the other rabbitmq-ha-exporter. Could this have something to do with the errors?

I realise this may not be a fault with this exporter, but I thought you might be able to shed some light on the issue.

Many thanks!

ufou · 2019-04-16T16:08:10Z

@kbudde is this repo still active?

kbudde · 2019-04-19T19:55:00Z

@ufou Yes, this repo is still active. Just the response times are not always perfect.

It's totally normal to run one exporter as companion for every rabbitmq host.
The error normally indicates that there are too many open connections. I've never seen this issue.

What rabbitmq version and what rabbitmq exporter version are you using?
You could try to gather the metrics less often (e.g. just once a minute) to check if the issue disappears.

ufou · 2019-04-19T20:52:14Z

@kbudde thanks, I'll try decreasing the scrape frequency. I had the same thought, that the system was unable to open connections because a limit was being reached, but the number of open connections is hovering around 90, which really is nothing at all...

I'm running rabbitmq version 3.7.14 (though I've actually tried versions down to 3.7.6 to see if it was something in a recent version - the problem occurred with all versions tested), I'm running exporter version 0.29.0, again I've tested with older versions but the problem still occurs - I've also tried 1.0.0-RC4 but still no joy. My hunch is this is an environmental problem with kubernetes, I just cannot figure out what is causing it (hence hope you might have seen similar) - thanks for getting back to me anyway!

ufou · 2019-04-19T21:02:07Z

One other thing I noticed is that it could be IPv6 related:

dial tcp [::1]:15672: connect: cannot assign requested address

When a rabbitmq pod gets restarted I get more legitimate-looking errors in the exporter logs:

dial tcp 127.0.0.1:15672: connect: connection refused
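
Editor's note: the two messages above correspond to different OS-level errors, which can be told apart in code. Below is a minimal Go sketch, not taken from the exporter, that classifies a dial error with `errors.Is`. The names `classifyDialError`, `simulatedDialError`, and `classifySamples` are hypothetical, and the errors are constructed by hand to resemble what `net.Dial` returns rather than produced by a real connection. "cannot assign requested address" is `EADDRNOTAVAIL`, which typically means the kernel could not pick a usable local address or port (for example, plausibly, when `::1` is not configured in the pod), while "connection refused" is `ECONNREFUSED`, meaning nothing is listening on the target port.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
	"syscall"
)

// classifyDialError maps a dial error to a coarse category by unwrapping
// it down to the underlying errno.
func classifyDialError(err error) string {
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, syscall.EADDRNOTAVAIL):
		// "cannot assign requested address"
		return "local address unavailable"
	case errors.Is(err, syscall.ECONNREFUSED):
		// "connection refused"
		return "connection refused"
	default:
		return "other"
	}
}

// simulatedDialError builds an error with the same wrapping layers as a
// failed net.Dial (OpError -> SyscallError -> Errno). Illustration only.
func simulatedDialError(ip string, errno syscall.Errno) error {
	return &net.OpError{
		Op:   "dial",
		Net:  "tcp",
		Addr: &net.TCPAddr{IP: net.ParseIP(ip), Port: 15672},
		Err:  os.NewSyscallError("connect", errno),
	}
}

// classifySamples classifies one hand-built error of each kind.
func classifySamples() []string {
	return []string{
		classifyDialError(simulatedDialError("::1", syscall.EADDRNOTAVAIL)),
		classifyDialError(simulatedDialError("127.0.0.1", syscall.ECONNREFUSED)),
	}
}

func main() {
	for _, c := range classifySamples() {
		fmt.Println(c)
	}
}
```

In a real check, the error passed to `classifyDialError` would come from `net.Dial` or an HTTP client call instead of the hand-built samples.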

gaizeror · 2019-04-20T08:27:40Z

Maybe your username and password are wrong, or the user doesn't have the required permissions?

ufou · 2019-04-22T06:21:43Z

@gaizeror I wish it were that simple, but that would not explain the intermittent nature of the errors:

time="2019-04-10T14:04:30Z" level=error msg="Error while retrieving data from rabbitHost" error="Get http://localhost:15672/api/exchanges?sort=: dial tcp [::1]:15672: connect: cannot assign requested address" host="http://localhost:15672" statusCode=0
time="2019-04-10T14:04:30Z" level=info msg="Metrics updated" duration=92.947296ms
time="2019-04-10T14:04:40Z" level=info msg="Metrics updated" duration=38.490507ms

Note: metrics are being successfully scraped (I see them in a Grafana dashboard I created), but the dashboard data flips between being accurate and inaccurate because of the errors.

ufou · 2019-04-23T07:47:58Z

I found a workaround: setting an env variable on the exporter container to force IPv4 seems to work:

RABBIT_URL: http://127.0.0.1:15672

Thanks for looking into this!
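
Editor's note: the same idea can be expressed in code by normalising a loopback URL to its IPv4 form before use. The Go sketch below is an illustration under stated assumptions, not the exporter's implementation: `forceIPv4Loopback` is a hypothetical helper, and the fallback value `http://localhost:15672` is taken from the log lines in this thread. It rewrites a host of `localhost` or `::1` to `127.0.0.1` and leaves every other URL untouched.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
	"os"
)

// forceIPv4Loopback rewrites URLs that point at "localhost" or "::1" so
// that they use 127.0.0.1 instead. Other hosts are returned unchanged.
func forceIPv4Loopback(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	host := u.Hostname() // strips the port and any IPv6 brackets
	if host != "localhost" && host != "::1" {
		return rawURL, nil
	}
	if port := u.Port(); port != "" {
		u.Host = net.JoinHostPort("127.0.0.1", port)
	} else {
		u.Host = "127.0.0.1"
	}
	return u.String(), nil
}

func main() {
	// RABBIT_URL is the variable from the workaround above; the fallback
	// here is an assumption made for this example.
	raw := os.Getenv("RABBIT_URL")
	if raw == "" {
		raw = "http://localhost:15672"
	}
	fixed, err := forceIPv4Loopback(raw)
	if err != nil {
		fmt.Fprintln(os.Stderr, "invalid RABBIT_URL:", err)
		os.Exit(1)
	}
	fmt.Println(fixed)
}
```

Rewriting the URL keeps name resolution out of the picture entirely, which matches the spirit of the workaround; an alternative design would be to keep `localhost` and restrict the dialer to IPv4.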

ufou closed this as completed Apr 23, 2019

kbudde added a commit that referenced this issue Apr 23, 2019

use ipv4 for api by default

c40deb4

As seen in #120, the exporter tries to fetch the API over IPv6 in Kubernetes. Switched the default rabbiturl to 127.0.0.1 instead of localhost.

desaintmartin mentioned this issue Sep 12, 2019

[stable/rabbitmq] Metrics: use ipv4 to connect to rabbit. helm/charts#17092

Merged
