Telegraf seems to have a memory leak #111

Closed
mattk42 opened this issue Jun 17, 2016 · 9 comments

@mattk42

mattk42 commented Jun 17, 2016

In my cluster running the 2.0 official release I have noticed that telegraf works its way up to 2GB of memory usage before it seemingly gets cleaned up and starts over.

[Image: "Memory Leak" chart]

@jchauncey
Member

Are you noticing the pod restarting a lot? There was an issue in the 0.13.x version of telegraf that caused the binary to panic when sending data to influx - influxdata/telegraf#1268

Check the pod logs and see if you see a similar error message.

@mattk42
Author

mattk42 commented Jun 17, 2016

Unfortunately, I actually killed off the DS. I have my own monitoring in place, so I shut down most of the deis-monitor components this morning.

I don't believe the container is getting restarted, though. If that were the case, the container IDs in the chart above would have changed.

@titilambert

Hello, I think I have the same issue.
I suspect the prometheus plugin. It seems it isn't closing its connections :/
Run this on your apiservers: netstat -ntp | grep 8080 | wc -l
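
For anyone who wants a finer breakdown than a raw line count, here is a minimal Python sketch that parses `netstat -ntp` output and groups connections on a given port by TCP state. The function name `count_connections` and the sample rows are made up for illustration and are not taken from a real cluster. Unlike `grep 8080`, it matches the port field exactly, so it will not count unrelated lines that merely contain the digits 8080.

```python
from collections import Counter

def count_connections(netstat_output: str, port: int = 8080) -> Counter:
    """Count TCP connections whose local or foreign port equals `port`, by state."""
    counts = Counter()
    wanted = str(port)
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected row layout from `netstat -ntp`:
        # Proto Recv-Q Send-Q Local-Address Foreign-Address State PID/Program
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue  # skip banner, header, and non-TCP rows
        local, foreign, state = fields[3], fields[4], fields[5]
        # The port is whatever follows the last ':' (also works for tcp6 rows).
        if wanted in (local.rsplit(":", 1)[-1], foreign.rsplit(":", 1)[-1]):
            counts[state] += 1
    return counts

# Hypothetical sample output, for illustration only.
SAMPLE = """\
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 10.0.0.1:8080           10.0.0.5:51000          ESTABLISHED 1234/kube-apiserver
tcp        0      0 10.0.0.1:8080           10.0.0.5:51001          ESTABLISHED 1234/kube-apiserver
tcp        0      0 10.0.0.1:22             10.0.0.9:40000          ESTABLISHED 900/sshd
tcp        0      0 10.0.0.1:8080           10.0.0.6:51002          CLOSE_WAIT  1234/kube-apiserver
"""

if __name__ == "__main__":
    for state, n in count_connections(SAMPLE).items():
        print(state, n)
```

On a real apiserver you could feed it the live output (for example via `subprocess.run(["netstat", "-ntp"], capture_output=True, text=True).stdout`). A count that keeps climbing between runs would be consistent with connections not being closed.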

@jchauncey
Member

We have seen this happen (especially on larger clusters), so we disabled the prometheus plugin by default in the image (although the chart turns it on). Disabling it means you will lose out on k8s metrics and container metrics. I will open an issue with telegraf and see if we can get it fixed.

@jchauncey
Member

See here - influxdata/telegraf#1405

@titilambert

PR influxdata/telegraf#1406 created

@jchauncey
Member

So I have rebuilt the image to include the latest master changes. It seems to have fixed the memory leak problem, but I'm not 100% sure of that. If you want to redeploy telegraf and check it out, that would be awesome.

@titilambert

@jchauncey I'll test today or tomorrow. I'll let you know when I get results ;)

@titilambert

The issue is fixed for me!
