-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cpufreq plugin #4215
Cpufreq plugin #4215
Conversation
Update from orifinal
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR @R4scal. I haven't looked through the code closely yet but have a few ideas based on the README.
plugins/inputs/cpufreq/README.md
Outdated
# throttles_per_socket = false | ||
## Gather CPU throttles per physical core | ||
## Defaults: | ||
# throttles_per_core = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we won't need these two option to enable/disable per core and socket, let's just have them be always on?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I agree with a single option. But if we default to on, the code should be adjusted to not throw an error if the core_throttle_count
file is missing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think the performance difference would be noticeable if we removed all of the options? The measurement filtering options can be used to trim out any unwanted data so the main argument in favor would be to reduce the cost of reading the files.
plugins/inputs/cpufreq/README.md
Outdated
## Path for sysfs filesystem. | ||
## See https://www.kernel.org/doc/Documentation/filesystems/sysfs.txt | ||
## Defaults: | ||
# path_sysfs = "/sys" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Call this host_sys
and add support for HOST_SYS environment variable which is used only if the option is unset. This is similar to the support in gopsutil.
plugins/inputs/cpufreq/README.md
Outdated
> cpufreq,cpu=5,host=server01 cur_freq=3801758000,max_freq=3900000000,min_freq=800000000 1527789803000000000 | ||
> cpufreq,cpu=6,host=server01 cur_freq=3839194000,max_freq=3900000000,min_freq=800000000 1527789803000000000 | ||
> cpufreq,cpu=7,host=server01 cur_freq=3877989000,max_freq=3900000000,min_freq=800000000 1527789803000000000 | ||
> cpufreq_cpu_throttles,cpu=0,host=server01 count=0 1527789803000000000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's merge this series with cpufreq
since it has the same tagset and represents the same object.
plugins/inputs/cpufreq/README.md
Outdated
> cpufreq_core_throttles,core=0,cpu=0,host=server01 count=0 1527789803000000000 | ||
> cpufreq_core_throttles,core=1,cpu=0,host=server01 count=0 1527789803000000000 | ||
> cpufreq_core_throttles,core=2,cpu=0,host=server01 count=0 1527789803000000000 | ||
> cpufreq_core_throttles,core=3,cpu=0,host=server01 count=0 1527789803000000000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's call this measurement cpufreq_core
and the field can be just throttles. This will allow us to add more per core metrics later on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you're contradicting yourself. Your comment right above says to merge cpufreq_core_throttles
with cpufreq
. But this comment says to just rename cpufreq_core_throttles
to cpufreq_core
:-)
I'm in favor of merging both into the cpufreq
measurement. No need for 2 measurements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking there would be two measurements: cpufreq
and cpufreq_core
. cpufreq_core
would have a core
and cpu
tag while cpufreq
would only have cpu
.
If we have this, then it is hard to select just the core throttling because you have to exclude the item without the tag:
cpufreq,cpu=0 throttle=42i,cur_freq=3756293000i
cpufreq,cpu=0,core=0 throttle=42i
This could maybe work, but not sure if I like the naming as much:
cpufreq,cpu=0 throttle=42i,cur_freq=3756293000i
cpufreq,cpu=0,core=0 core_throttle=42i
And here is an example of what I was thinking originally:
cpufreq,cpu=0 throttle=42i,cur_freq=3756293000i
cpufreq_core,cpu=0,core=0 throttle=42i
plugins/inputs/cpufreq/cpufreq.go
Outdated
acc.AddError(err) | ||
continue | ||
} else { | ||
fileds["cur_freq"] = float64(value) * 1000.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are these stored as float64
, and not int
?
plugins/inputs/cpufreq/README.md
Outdated
# throttles_per_socket = false | ||
## Gather CPU throttles per physical core | ||
## Defaults: | ||
# throttles_per_core = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I agree with a single option. But if we default to on, the code should be adjusted to not throw an error if the core_throttle_count
file is missing.
plugins/inputs/cpufreq/README.md
Outdated
> cpufreq_core_throttles,core=0,cpu=0,host=server01 count=0 1527789803000000000 | ||
> cpufreq_core_throttles,core=1,cpu=0,host=server01 count=0 1527789803000000000 | ||
> cpufreq_core_throttles,core=2,cpu=0,host=server01 count=0 1527789803000000000 | ||
> cpufreq_core_throttles,core=3,cpu=0,host=server01 count=0 1527789803000000000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you're contradicting yourself. Your comment right above says to merge cpufreq_core_throttles
with cpufreq
. But this comment says to just rename cpufreq_core_throttles
to cpufreq_core
:-)
I'm in favor of merging both into the cpufreq
measurement. No need for 2 measurements.
plugins/inputs/cpufreq/cpufreq.go
Outdated
acc.AddFields( | ||
"cpufreq_cpu_throttles", | ||
map[string]interface{}{ | ||
"count": float64(packageThrottleCount), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Repeat of the float64
vs int
question. Why?
Thoughts on including |
Is there anything new on here ? |
I believe we're still waiting for @R4scal to address the feedback |
Is there anything new ? This would be a really useful plugin if available |
If someone wants to pick this up where R4scal left off, let me know and I'll "assign" you to this so the work isn't duplicated. |
1aa3fff
to
e4a5236
Compare
Hi. |
@danielnelson @phemmer ping |
@danielnelson Would love to see this pulled into master. |
Any plans to have this merged? |
Adding another voice of support for this plugin. Hoping it can be merged? In my field (High Performance Computing), we run the CPUs at 100% utilization most of the time and it's very desirable to measure operating frequencies. |
ya, another HPC guy here. |
## Defaults: | ||
# gather_throttles = false | ||
``` | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please add a Metric section describing the fields similar to #8342!?
> cpufreq,cpu=0,host=server01 cur_freq=3756293000i,max_freq=3900000000i,min_freq=800000000i,throttle_count=0i 1527789803000000000 | ||
> cpufreq,cpu=1,host=server01 cur_freq=3735119000i,max_freq=3900000000i,min_freq=800000000i,throttle_count=0i 1527789803000000000 | ||
> cpufreq,cpu=2,host=server01 cur_freq=3786381000i,max_freq=3900000000i,min_freq=800000000i,throttle_count=0i 1527789803000000000 | ||
> cpufreq,cpu=3,host=server01 cur_freq=3823190000i,max_freq=3900000000i,min_freq=800000000i,throttle_count=0i 1527789803000000000 | ||
> cpufreq,cpu=4,host=server01 cur_freq=3780804000i,max_freq=3900000000i,min_freq=800000000i,throttle_count=0i 1527789803000000000 | ||
> cpufreq,cpu=5,host=server01 cur_freq=3801758000i,max_freq=3900000000i,min_freq=800000000i,throttle_count=0i 1527789803000000000 | ||
> cpufreq,cpu=6,host=server01 cur_freq=3839194000i,max_freq=3900000000i,min_freq=800000000i,throttle_count=0i 1527789803000000000 | ||
> cpufreq,cpu=7,host=server01 cur_freq=3877989000i,max_freq=3900000000i,min_freq=800000000i,throttle_count=0i 1527789803000000000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about renaming cur_freq
, max_freq
, min_freq
to current
, max
, min
? The freq_
part seems superfluous given that the metric is named cpufreq
.
@@ -0,0 +1,116 @@ | |||
package cpufreq |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As this only works on linux, please make sure it only builds on linux.
package cpufreq | |
// +build linux | |
package cpufreq |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@R4scal
Moreover, you will need to add cpufreq_notlinux.go
file with following content:
// +build !linux
package cpufreq
To build (but not include in binary) your plugin succesfully for non-Linux systems.
Look at systtat plugin: https://github.com/influxdata/telegraf/tree/f7d94430d29d15c42d8d186c99bfd065b9a21fa0/plugins/inputs/sysstat
} | ||
|
||
func (g *CPUFreq) Description() string { | ||
return "Read specific statistics per cgroup" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably
return "Read specific statistics per cgroup" | |
return "Collect current CPU's frequencies" |
would be better, wouldn't it?
if g.PathSysfs == "" { | ||
if os.Getenv("HOST_SYS") != "" { | ||
g.PathSysfs = os.Getenv("HOST_SYS") | ||
} else { | ||
g.PathSysfs = defaultHostSys | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move this to an Init()
function that is called once when the plugin is constructed. This will avoid doing it over and over again.
|
||
cpus, err := filepath.Glob(path.Join(g.PathSysfs, "devices/system/cpu/cpu[0-9]*")) | ||
if err != nil { | ||
acc.AddError(err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
filepath.Glob
will only return an error for invalid patterns according to its documentation. Therefore, and because you won't have cpus
anyway, we should simply return err
here.
acc.AddError(err) | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put a `continue here and remove the else saving one indentation level.
if value, err = readUintFromFile(filepath.Join(cpu, "cpufreq", "scaling_cur_freq")); err != nil { | ||
acc.AddError(err) | ||
continue | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for an else
here.
} | ||
if value, err = readUintFromFile(filepath.Join(cpu, "cpufreq", "scaling_min_freq")); err != nil { | ||
acc.AddError(err) | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use continue
as with the previous condition. Maybe it would be nice to avoid the code duplication at all and construct a for
loop running over {"min", "max", "cur"}
.
acc.AddError(err) | ||
continue | ||
} else { | ||
fileds["cur_freq"] = uint64(value) * 1000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo fields
instead of fileds
.
if os.Getenv("HOST_SYS") != "" { | ||
g.PathSysfs = os.Getenv("HOST_SYS") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is HOST_SYS
a standard environment variable? I didn't find any documentation about it. Can you point me to a reference?
If it is not, remove this section as an alternative prefix can still be specified in the config.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are some minor comments. Nothing too big. @R4scal could you please take a look!?
@R4scal any chance you can work on this to resolve the comments? |
Hi. |
Hi.
This plugin work on Linux systems and collect information about cpu frequency and throttles
Required for all PRs: