procstat input: Add Option to sum all matching processes #2480

Closed
hobeone opened this issue Mar 1, 2017 · 4 comments
Labels
area/procstat feature request Requests for new plugin and for new features to existing plugins

Comments

@hobeone

hobeone commented Mar 1, 2017

Feature Request

It would be useful to be able to sum the resource usage of all matching processes in the procstat input plugin. Currently, if more than one process matches (and you haven't set pid_tag = true from #1668), only one process's information is saved to InfluxDB, since the tags are identical.

Adding pid_tag = true works around this at the cost of much higher resource usage in InfluxDB, which may be unnecessary if a sum is what you want anyway.

The changes to the code don't seem that large, and I'd be happy to make them, but I wanted to see if this is something that would be accepted first.

Proposal:

[[inputs.procstat]]
pattern = "chrome"
sum_matches = true # invalid to set this with pid_tag = true

Current behavior:

telegraf -config ./telegraf.conf --debug --test --input-filter procstat

One line per process matched

Desired behavior:

Sum the resource usage of all matched processes.

Use case:

For coarser-grained monitoring, the summed resource usage of a multiprocess or multithreaded daemon is sufficient.
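To make the request concrete, here is a minimal Python sketch of the collision described above and of the proposed summing. It is not Telegraf code: the `series_key` helper, the dictionary standing in for InfluxDB, the tag set, and all field values are assumptions made up for illustration.

```python
from collections import defaultdict

def series_key(measurement, tags, timestamp):
    """Hypothetical point identity: measurement + tag set + timestamp."""
    return (measurement, tuple(sorted(tags.items())), timestamp)

# Three matched processes at one timestamp; values are made up for illustration.
matched = [
    {"pid": 101, "cpu_time_user": 1.5, "num_threads": 4},
    {"pid": 102, "cpu_time_user": 2.0, "num_threads": 8},
    {"pid": 103, "cpu_time_user": 0.5, "num_threads": 2},
]
tags = {"pattern": "chrome"}  # assumed identical for every match when pid_tag is off
timestamp = 1488326400        # arbitrary example timestamp

# Behavior without pid_tag: every point has the same key, so the last write wins.
store = {}
for proc in matched:
    fields = {k: v for k, v in proc.items() if k != "pid"}
    store[series_key("procstat", tags, timestamp)] = fields

# Proposed behavior: add up the fields of all matches into a single point.
summed = defaultdict(float)
for proc in matched:
    for name, value in proc.items():
        if name != "pid":
            summed[name] += value

print(len(store), store)
print(dict(summed))
```

Only one of the three points survives in `store`, while `summed` keeps a contribution from every match.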

@sparrc sparrc added this to the Future Milestone milestone Mar 2, 2017
@phemmer
Contributor

phemmer commented Mar 2, 2017

What do you propose to do with fields that can't be summed? Such as cpu_nice or memory_rss?

@hobeone
Author

hobeone commented Mar 2, 2017

Telegraf is only showing "cpu_time_nice" for me. That should be summable though, I think. It's a number of ticks if I'm reading gopsutil correctly.

Resident set size is tricky and might not be super useful in this case - though perhaps just taking the max RSS might be okay? What would you suggest?

It has been a while since I've dug into this sort of thing so I could be entirely wrong.
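One way to picture the question of non-summable fields is a per-field policy, where each field names its own reducer. The sketch below is only an illustration of that idea: the `REDUCERS` table and `merge_matches` function are hypothetical, the sample values are made up, and using `max` for memory_rss is just the suggestion floated above, not an established rule.

```python
# Hypothetical per-field policy: which reducer to apply when merging matches.
REDUCERS = {
    "cpu_time_nice": sum,  # assumed to be a tick counter, so it adds up
    "cpu_time_user": sum,
    "memory_rss": max,     # the "max RSS" idea from this thread, not a settled choice
}

def merge_matches(points, reducers=REDUCERS):
    """Merge per-process field dicts into one dict, skipping fields with no policy."""
    merged = {}
    for field, reduce_fn in reducers.items():
        values = [p[field] for p in points if field in p]
        if values:
            merged[field] = reduce_fn(values)
    return merged

# Made-up sample values for two matched processes.
points = [
    {"cpu_time_nice": 0.0, "cpu_time_user": 2.0, "memory_rss": 100, "pid": 1},
    {"cpu_time_nice": 1.0, "cpu_time_user": 3.0, "memory_rss": 300, "pid": 2},
]

merged = merge_matches(points)
print(merged)
```

Fields without an entry in the policy table (here `pid`) are dropped rather than guessed at, which keeps the merged point from carrying a meaningless value.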

@danielnelson danielnelson removed this from the Future Milestone milestone Jun 14, 2017
@danielnelson
Contributor

@hobeone You should be able to do this now with the basicstats aggregator and careful use of the measurement filtering options to control what is aggregated.

@danielnelson danielnelson added the feature request Requests for new plugin and for new features to existing plugins label Apr 9, 2018
@AsteroidOrangeJuice

I'm curious if anyone can provide an example of this. I have this data:

> SELECT "memory_rss", "memory_usage", "cpu_time_user", "cpu_usage", "process_name"::tag, "host"::tag, "pid"::tag FROM "procstat" WHERE  time >= 1729489643000ms AND ("host"::tag = 'myserver') AND "process_name"::tag = 'gunicorn' 
name: procstat
time                memory_rss memory_usage        cpu_time_user cpu_usage           process_name host     pid
----                ---------- ------------        ------------- ---------           ------------ ----     ---

1729489890000000000 87990272   1.0745766162872314  39.22         0.1000449482443498  gunicorn     myserver 576450
1729489890000000000 83382272   1.0183016061782837  60.95         0.20023262269235317 gunicorn     myserver 576483
1729489890000000000 85188608   1.0403614044189453  34.37         0.10011926437044126 gunicorn     myserver 576490
1729489890000000000 90357760   1.1034893989562988  59.03         0.2002228928896298  gunicorn     myserver 576478
1729489890000000000 90087424   1.10018789768219    59.4          0.10001307805018146 gunicorn     myserver 576438
1729489890000000000 89935872   1.0983370542526245  48.03         0.20019208306253883 gunicorn     myserver 576467
1729489890000000000 87633920   1.070224642753601   52.49         0.1000749386760502  gunicorn     myserver 576461
1729489890000000000 89600000   1.0942353010177612  52.36         0.200031443122533   gunicorn     myserver 576437
1729489890000000000 21757952   0.26571783423423767 4.67          0                   gunicorn     myserver 576422
1729489890000000000 87384064   1.0671732425689697  40.03         0.10010247830049032 gunicorn     myserver 576474
1729489890000000000 88846336   1.0850311517715454  40.66         0.10003166747509519 gunicorn     myserver 576440
1729489900000000000 87384064   1.0671732425689697  40.03         0.10000889439101167 gunicorn     myserver 576474
1729489900000000000 88846336   1.0850311517715454  40.67         0.10000649682211071 gunicorn     myserver 576440
1729489900000000000 89935872   1.0983370542526245  48.04         0.10000651238406394 gunicorn     myserver 576467
1729489900000000000 85188608   1.0403614044189453  34.38         0.19997402915289375 gunicorn     myserver 576490
1729489900000000000 87990272   1.0745766162872314  39.23         0.10000613400621736 gunicorn     myserver 576450
1729489900000000000 83382272   1.0183016061782837  60.96         0.09998818594579272 gunicorn     myserver 576483
1729489900000000000 90087424   1.10018789768219    59.41         0.2000135055518983  gunicorn     myserver 576438
1729489900000000000 90357760   1.1034893989562988  59.04         0.09999353176831632 gunicorn     myserver 576478
1729489900000000000 21757952   0.26571783423423767 4.67          0                   gunicorn     myserver 576422
1729489900000000000 87633920   1.070224642753601   52.5          0.10000587378497335 gunicorn     myserver 576461
1729489900000000000 89600000   1.0942353010177612  52.37         0.10000654015768877 gunicorn     myserver 576437
1729489910000000000 87633920   1.070224642753601   52.51         0.09992207007807172 gunicorn     myserver 576461
1729489910000000000 89600000   1.0942353010177612  52.38         0.09992023595387058 gunicorn     myserver 576437
1729489910000000000 85188608   1.0403614044189453  34.39         0.09994133946076598 gunicorn     myserver 576490
1729489910000000000 88846336   1.0850311517715454  40.68         0.09992241952401248 gunicorn     myserver 576440
1729489910000000000 87384064   1.0671732425689697  40.05         0.19984360063947607 gunicorn     myserver 576474
1729489910000000000 90357760   1.1034893989562988  59.04         0.09993410706746556 gunicorn     myserver 576478
1729489910000000000 90087424   1.10018789768219    59.41         0.09992244263818299 gunicorn     myserver 576438
1729489910000000000 87990272   1.0745766162872314  39.24         0.1998452180800923  gunicorn     myserver 576450
1729489910000000000 89935872   1.0983370542526245  48.05         0.09992169584343957 gunicorn     myserver 576467
1729489910000000000 21757952   0.26571783423423767 4.68          0.0999204672549214  gunicorn     myserver 576422
1729489910000000000 83382272   1.0183016061782837  60.97         0.09993949907582839 gunicorn     myserver 576483
> 

I have 11 different gunicorn processes running on my machine at once, and I want to sum the total memory percentage gunicorn is taking up. In this case, that is 11 processes x three 10-second intervals, for 33 points. I want to collapse that into 3 points by summing all values that share a timestamp, grouping by process_name regardless of the other tags.

Tags are required for keeping the series separate (as they are the indexes), and the second I drop the pid tag, I lose the other series. The basicstats aggregator seems good for combining datapoints from multiple time windows across a single series, but I think this is multiple series.
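The collapse being asked for can be sketched in plain Python as a sum keyed on (time, process_name), computed while the pid is still available to tell the points apart. This is not a Telegraf or InfluxDB feature, only an illustration of the grouping; the rows are a subset of the data above (3 of the 11 pids), and the variable names are assumptions.

```python
from collections import defaultdict

# (time, process_name, pid, memory_usage): 3 of the 11 pids from the output above.
rows = [
    (1729489890000000000, "gunicorn", "576450", 1.0745766162872314),
    (1729489890000000000, "gunicorn", "576483", 1.0183016061782837),
    (1729489890000000000, "gunicorn", "576490", 1.0403614044189453),
    (1729489900000000000, "gunicorn", "576490", 1.0403614044189453),
    (1729489900000000000, "gunicorn", "576450", 1.0745766162872314),
    (1729489900000000000, "gunicorn", "576483", 1.0183016061782837),
    (1729489910000000000, "gunicorn", "576490", 1.0403614044189453),
    (1729489910000000000, "gunicorn", "576450", 1.0745766162872314),
    (1729489910000000000, "gunicorn", "576483", 1.0183016061782837),
]

# Group on (time, process_name) and ignore the pid, so each timestamp
# yields one summed point per process name.
totals = defaultdict(float)
for time_ns, process_name, _pid, memory_usage in rows:
    totals[(time_ns, process_name)] += memory_usage

for (time_ns, process_name), total in sorted(totals.items()):
    print(time_ns, process_name, total)
```

Nine input points become three output points here; with all 11 pids the same grouping would turn the 33 points into 3.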
