-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make metrics better follow guidelines #787
Conversation
3595e53
to
5fd9726
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
collector/diskstats_linux.go
Outdated
@@ -185,6 +185,11 @@ func (c *diskstatsCollector) Update(ch chan<- prometheus.Metric) error { | |||
if err != nil { | |||
return fmt.Errorf("invalid value %s in diskstats: %s", value, err) | |||
} | |||
// Convert to seconds | |||
switch i { | |||
case 3, 7, 9, 10: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's pretty prone to errors. Would be great to use a value function instead, similar to what is done already in diskstats_darwin.go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
5fd9726
to
380a9ce
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
380a9ce
to
d476e33
Compare
As just discussed there with @SuperQ I have removed the diskstat sector metrics, as that's already covered by the bytes metrics. |
I like the |
Remove sector metrics, the bytes metrics cover those already.
See https://github.com/torvalds/linux/blob/3c073991eb417b6f785ddc6afbbdc369eb84aa6a/kernel/time/ntp.c#L909 for what stabil means, looks like a moving average of some form.
d476e33
to
12287f9
Compare
12287f9
to
9dfb5d1
Compare
I've added in meminfo and interrupts changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! LGTM
Worth throwing |
@mdlayher Sounds like a good idea! Does it pass already with these changes? |
Last time I tried, my PR passed every check except for the metrics exposed by the Prometheus Go client. We could either add an exception, or just fix the go client to use seconds instead of microseconds. (IIRC) |
@@ -54,7 +54,7 @@ func init() { | |||
func NewCPUCollector() (Collector, error) { | |||
return &cpuCollector{ | |||
cpu: prometheus.NewDesc( | |||
prometheus.BuildFQName(namespace, "", cpuCollectorSubsystem), | |||
prometheus.BuildFQName(namespace, cpuCollectorSubsystem, "seconds_total"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't cpu_usage_seconds_total
be the more common naming here rather than just cpu_seconds_total
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe, but not everything is used
, what about idle
and steal
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dunno "used" still sounds ok to me even for those :) But I don't care much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/cc @matthiasr
* Improve stat linux metric names. cpu is no longer used. * node_cpu -> node_cpu_seconds_total for Linux * Improve filesystem metric names with units * Improve units and names of linux disk stats Remove sector metrics, the bytes metrics cover those already. * Infiniband counters should end in _total * Improve timex metric names, convert to more normal units. See https://github.com/torvalds/linux/blob/3c073991eb417b6f785ddc6afbbdc369eb84aa6a/kernel/time/ntp.c#L909 for what stabil means, looks like a moving average of some form. * Update test fixture * For meminfo metrics that had "kB" units, add _bytes * Interrupts counter should have _total
I think that's all the main ones, I'm a little unsure on the timex ones even after looking at the docs and source
@SuperQ
Fixes #150
Supercedes #559