Counters reported as Gauges in Prometheus metrics #3031
Comments
@danielgblanco - could you verify this with the latest version of VPC CNI? This was fixed in the recent versions of CNI 1.18.3.
This is still present as a Gauge in master. We need to fix this to use a Counter.
Sorry, I've been on PTO; thanks for the follow-up.
Looks like it's not just the AddIPCnt metric that was incorrectly changed to a gauge. I think this was done incorrectly in 2ac9e0a#diff-6c65a620b5206565cbd61b3390a33e02146dac52f62735909564c6b963127968L182
This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days
Issue closed due to inactivity.
What happened:
Some of the Prometheus metrics exported by the VPC CNI plugin are defined with inaccurate metric types. For example:
amazon-vpc-cni-k8s/utils/prometheusmetrics/prometheusmetrics.go
Line 64 in 27ce136
This metric (awscni_add_ip_req_count) is exported as a gauge, but it holds cumulative, incrementing values. In fact, it seems that it's used as a counter in:
amazon-vpc-cni-k8s/pkg/ipamd/rpc_handler.go
Line 70 in 27ce136
It seems that awscni_del_ip_req_count is correctly exported as a counter.
I probably don't have enough context on this to make a judgement call. However, I think there are probably more Gauges that are operating as Counters.
Attach logs
N/A
What you expected to happen:
I'd expect metrics to follow the semantic conventions defined in https://prometheus.io/docs/concepts/metric_types/
How to reproduce it (as minimally and precisely as possible):
Using Prometheus exporters.
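One way to check which type a metric is exported with is to read the "# TYPE" lines in the text exposition output of the metrics endpoint. The following is a minimal Go sketch of that check, not code from the plugin: the metricTypes helper is hypothetical, and the sample scrape text and its values are made up for illustration (only the two metric names and their reported types come from this issue).

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// metricTypes (hypothetical helper) extracts the declared type of each metric
// from Prometheus text exposition output by reading "# TYPE <name> <type>" lines.
func metricTypes(exposition string) map[string]string {
	types := make(map[string]string)
	scanner := bufio.NewScanner(strings.NewReader(exposition))
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		if len(fields) == 4 && fields[0] == "#" && fields[1] == "TYPE" {
			types[fields[2]] = fields[3]
		}
	}
	return types
}

func main() {
	// Illustrative scrape output; the sample values are invented.
	sample := `# TYPE awscni_add_ip_req_count gauge
awscni_add_ip_req_count 42
# TYPE awscni_del_ip_req_count counter
awscni_del_ip_req_count 17
`
	types := metricTypes(sample)
	for _, name := range []string{"awscni_add_ip_req_count", "awscni_del_ip_req_count"} {
		fmt.Printf("%s: %s\n", name, types[name])
	}
}
```

Running the same check against real scrape output would list every metric whose declared type can then be compared with how the code updates it.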
Anything else we need to know?:
This may not be a critical issue if systems use Prometheus as the backend. However, it becomes a problem when Prometheus metrics are transformed into other representations. For example, OpenTelemetry Collectors will read this as a Gauge, which gives the aggregation a different meaning (e.g. one can change the temporality of counters from cumulative to delta or vice versa, which does not apply to gauges).
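To illustrate why the declared type matters for such transformations, here is a simplified Go sketch of a cumulative-to-delta conversion. It is an assumption-laden illustration, not the OpenTelemetry Collector's implementation: the cumulativeToDelta function and the readings are hypothetical. The conversion treats a drop in value as a counter reset, which is only a valid interpretation when the series really is a counter; for a true gauge, a drop is just a legitimate lower value.

```go
package main

import "fmt"

// cumulativeToDelta (hypothetical helper) converts successive cumulative
// counter readings into per-interval increases. A reading lower than its
// predecessor is treated as a counter reset (for example a process restart),
// so the new reading itself is taken as the increase for that interval.
func cumulativeToDelta(readings []float64) []float64 {
	if len(readings) < 2 {
		return nil
	}
	deltas := make([]float64, 0, len(readings)-1)
	for i := 1; i < len(readings); i++ {
		prev, cur := readings[i-1], readings[i]
		if cur < prev {
			// Reset detected: count everything since the restart.
			deltas = append(deltas, cur)
			continue
		}
		deltas = append(deltas, cur-prev)
	}
	return deltas
}

func main() {
	// Invented readings of a cumulative request count, with one reset (15 -> 3).
	readings := []float64{10, 15, 15, 3, 8}
	fmt.Println(cumulativeToDelta(readings)) // prints [5 0 3 5]
}
```

A pipeline that sees the series typed as a gauge would typically skip this kind of conversion and forward the raw cumulative values instead.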
Environment:
Kubernetes version (kubectl version): 1.28.12
OS (cat /etc/os-release): Bottlerocket 1.21.0
Kernel (uname -a): x86_64 GNU/Linux