Intermittent Error while retrieving data from rabbitHost statusCode=0 #120

Closed
ufou opened this issue Apr 10, 2019 · 7 comments

Comments

ufou commented Apr 10, 2019

Hi,

We are using the rabbitmq-ha helm chart, which also deploys the exporter to our cluster. Metrics only work some of the time: our dashboard frequently flips between OK and not OK, and the rabbitmq exporter logs show errors like these:

```
time="2019-04-10T14:03:50Z" level=info msg="Metrics updated" duration=36.638611ms
time="2019-04-10T14:04:00Z" level=info msg="Metrics updated" duration=114.957246ms
time="2019-04-10T14:04:10Z" level=error msg="Error while retrieving data from rabbitHost" error="Get http://localhost:15672/api/exchanges?sort=: dial tcp [::1]:15672: connect: cannot assign requested address" host="http://localhost:15672" statusCode=0
time="2019-04-10T14:04:10Z" level=error msg="Error while retrieving data from rabbitHost" error="Get http://localhost:15672/api/nodes?sort=: dial tcp [::1]:15672: connect: cannot assign requested address" host="http://localhost:15672" statusCode=0
time="2019-04-10T14:04:10Z" level=info msg="Metrics updated" duration=26.835083ms
time="2019-04-10T14:04:20Z" level=info msg="Metrics updated" duration=53.588805ms
time="2019-04-10T14:04:30Z" level=error msg="Error while retrieving data from rabbitHost" error="Get http://localhost:15672/api/exchanges?sort=: dial tcp [::1]:15672: connect: cannot assign requested address" host="http://localhost:15672" statusCode=0
time="2019-04-10T14:04:30Z" level=info msg="Metrics updated" duration=92.947296ms
time="2019-04-10T14:04:40Z" level=info msg="Metrics updated" duration=38.490507ms
```

We are running 3 replicas - so for each pod there are 2 containers, one rabbitmq-ha and the other rabbitmq-ha-exporter - I wondered if this might be something to do with the errors?

I realise this may not be a fault with this exporter, but I thought you might be able to shed some light on the issue.

Many thanks!

Author

ufou commented Apr 16, 2019

@kbudde is this repo still active?

Owner

kbudde commented Apr 19, 2019

@ufou Yes, this repo is still active. Just the response times are not always perfect.

It's totally normal to run one exporter as companion for every rabbitmq host.
The error normally indicates that there are too many open connections. I've never seen this issue.

What rabbitmq version and what rabbitmq exporter version are you using?
You could try to gather the metrics less often (e.g. just once a minute) to check if the issue disappears.

Author

ufou commented Apr 19, 2019

@kbudde thanks, I'll try decreasing the scrape frequency. I had the same thought, that the system was unable to open connections because a limit was being reached, but the number of open connections is hovering around 90, which really is nothing at all...

I'm running rabbitmq version 3.7.14 (I've also tried versions down to 3.7.6 to see whether a recent release introduced this, and the problem occurred with every version tested). The exporter is version 0.29.0; I've tested older versions and 1.0.0-RC4 as well, but the problem still occurs. My hunch is that this is an environmental problem with kubernetes, but I cannot figure out what is causing it (hence my hope that you might have seen something similar). Thanks for getting back to me anyway!

Author

ufou commented Apr 19, 2019

One other thing I noticed is that it could be IPv6 related:

dial tcp [::1]:15672: connect: cannot assign requested address

When a rabbitmq pod gets restarted, I get more legitimate-looking errors in the exporter logs:

dial tcp 127.0.0.1:15672: connect: connection refused
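
The two messages above differ in address family: `[::1]` is the IPv6 loopback and `127.0.0.1` is the IPv4 loopback. One way to see what `localhost` resolves to in a given environment is a small Go program like the sketch below. It is not taken from the exporter; `family` is a hypothetical helper introduced only for illustration, and the output depends on the container's `/etc/hosts` and resolver configuration.

```go
package main

import (
	"fmt"
	"net"
)

// family reports the address family of a literal IP such as "::1".
// It is a hypothetical helper, not part of the exporter.
func family(addr string) string {
	ip := net.ParseIP(addr)
	switch {
	case ip == nil:
		return "invalid"
	case ip.To4() != nil:
		return "IPv4"
	default:
		return "IPv6"
	}
}

func main() {
	// Ask the resolver for every address that "localhost" maps to.
	ips, err := net.LookupIP("localhost")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	for _, ip := range ips {
		fmt.Printf("%s\t%s\n", family(ip.String()), ip)
	}
}
```

If both an IPv4 and an IPv6 address are listed, a client that connects to `localhost` may attempt either one, which would be consistent with the mix of `[::1]` and `127.0.0.1` in the logs.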

@gaizeror

Maybe your username and password are wrong? Or the user doesn't have the required permissions.

Author

ufou commented Apr 22, 2019

@gaizeror I wish it were that simple, but that would not explain the intermittent nature of the errors:

```
time="2019-04-10T14:04:30Z" level=error msg="Error while retrieving data from rabbitHost" error="Get http://localhost:15672/api/exchanges?sort=: dial tcp [::1]:15672: connect: cannot assign requested address" host="http://localhost:15672" statusCode=0
time="2019-04-10T14:04:30Z" level=info msg="Metrics updated" duration=92.947296ms
time="2019-04-10T14:04:40Z" level=info msg="Metrics updated" duration=38.490507ms
```

Note: metrics are being scraped successfully (I see them in a grafana dashboard I created), but the dashboard data flips between accurate and inaccurate because of the errors.

Author

ufou commented Apr 23, 2019

I found a workaround. Setting an env variable on the exporter container to force IPv4 seems to work:

RABBIT_URL: http://127.0.0.1:15672

Thanks for looking into this!

ufou closed this as completed Apr 23, 2019
kbudde added a commit that referenced this issue Apr 23, 2019
as seen in #120 the exporter tries to fetch api with ipv6 in kubernetes
switched rabbiturl by default to 127.0.0.1 instead of localhost
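
The idea in the commit message, preferring the IPv4 loopback literal over the name `localhost`, can be pictured with the sketch below. It is an illustration under stated assumptions, not the exporter's actual code: `defaultRabbitURL` and `ipv4Loopback` are hypothetical names, and only the `RABBIT_URL` variable and the `127.0.0.1` address come from this thread.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
	"os"
)

// defaultRabbitURL is an assumed default that uses the IPv4 loopback literal,
// so no name resolution (and no IPv6 candidate) is involved.
const defaultRabbitURL = "http://127.0.0.1:15672"

// ipv4Loopback rewrites a "localhost" host to 127.0.0.1 and leaves any
// other URL unchanged. Input that cannot be parsed is returned as-is.
func ipv4Loopback(raw string) string {
	u, err := url.Parse(raw)
	if err != nil || u.Hostname() != "localhost" {
		return raw
	}
	if port := u.Port(); port != "" {
		u.Host = net.JoinHostPort("127.0.0.1", port)
	} else {
		u.Host = "127.0.0.1"
	}
	return u.String()
}

func main() {
	// Fall back to the assumed default when RABBIT_URL is not set.
	raw := os.Getenv("RABBIT_URL")
	if raw == "" {
		raw = defaultRabbitURL
	}
	fmt.Println(ipv4Loopback(raw))
}
```

Setting `RABBIT_URL` to `http://127.0.0.1:15672` explicitly, as in the workaround above, has the same effect without any rewriting.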