CONVENTIONS: Update CPU query sum_irate #988

Merged Jan 14, 2022 (1 commit)

Member

@wking wking commented Dec 15, 2021

Catching up with kubernetes-monitoring/kubernetes-mixin#619, which landed in OpenShift 4.9 and later here.

  # CPU usage of each container in the openshift-monitoring namespace
- max by (pod, container) (node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{namespace="openshift-monitoring"})
+ max by (pod, container) (node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{namespace="openshift-monitoring"})
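
For readers comparing the two recording-rule names, the rough Python sketch below illustrates the general difference between a windowed rate and an instantaneous rate. It is an illustration only: the helper names and the sample data are hypothetical, and the helpers deliberately ignore the counter-reset handling and window-boundary extrapolation that Prometheus applies.

```python
# Simplified stand-ins for PromQL rate() and irate().
# Assumption: no counter resets and no extrapolation, unlike real Prometheus.

def simple_rate(samples):
    """Average per-second increase between the first and last sample."""
    (t_first, v_first), (t_last, v_last) = samples[0], samples[-1]
    return (v_last - v_first) / (t_last - t_first)


def simple_irate(samples):
    """Per-second increase between the last two samples only."""
    (t_prev, v_prev), (t_last, v_last) = samples[-2], samples[-1]
    return (v_last - v_prev) / (t_last - t_prev)


# Hypothetical (timestamp_seconds, cpu_seconds_total) samples with a late burst.
cpu_samples = [(0, 0.0), (30, 3.0), (60, 6.0), (90, 9.0), (120, 36.0)]

window_rate = simple_rate(cpu_samples)    # smoothed over the whole window
instant_rate = simple_irate(cpu_samples)  # reflects only the latest interval
```

With this made-up data the windowed value averages the burst away, while the instantaneous value tracks it, which is why queries built on the sum_irate rule tend to need an explicit over-time aggregation to smooth results.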
Member Author

@wking wking Dec 15, 2021

More broadly, trying to mix openshift-monitoring and openshift-sdn results doesn't make sense to me. Perhaps this was intended to be commented out as an example of changing namespaces and dropping over-time aggregation? I'd expect something like:

sort_desc(
  # Calculate the 90th percentile of CPU usage over the past hour and add 10% to that
  1.1 * (
    max by (pod, container) (
      quantile_over_time(0.9, node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{namespace=~"openshift-.*", container!="POD", container!=""}[60m])
    )
  ) /
  # Calculate the maximum requested CPU per pod and container
  max by (pod, container) (kube_pod_container_resource_requests{namespace=~"openshift-.*", resource="cpu", container!="", container!="POD"})
)

Or, if folks don't want to weight for bursts, dropping to avg_over_time:

sort_desc(
  # Calculate the average CPU usage over the past hour
  (
    avg by (pod, container) (
      avg_over_time(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{namespace=~"openshift-.*", container!="POD", container!=""}[60m])
    )
  ) /
  # Calculate the maximum requested CPU per pod and container
  max by (pod, container) (kube_pod_container_resource_requests{namespace=~"openshift-.*", resource="cpu", container!="", container!="POD"})
)
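
To make the difference between the two variants concrete, here is a small Python sketch of the arithmetic for a single pod/container pair. The usage samples, the CPU request, and the helper name are all hypothetical, and the quantile helper uses linear interpolation between neighboring ranks, which is my understanding of how quantile_over_time behaves rather than a copy of the Prometheus code.

```python
import math


def quantile(q, values):
    """Quantile with linear interpolation between neighboring ranks."""
    ordered = sorted(values)
    rank = q * (len(ordered) - 1)
    lower = int(math.floor(rank))
    upper = min(lower + 1, len(ordered) - 1)
    weight = rank - lower
    return ordered[lower] * (1 - weight) + ordered[upper] * weight


# Hypothetical hour of CPU usage samples (cores): mostly idle, a few bursts.
usage_cores = [0.2] * 8 + [0.8] * 3
# Hypothetical CPU request for the same container (cores).
requested_cores = 0.5

# Burst-weighted variant: 90th percentile plus 10% headroom, over the request.
burst_ratio = 1.1 * quantile(0.9, usage_cores) / requested_cores

# Average variant: mean usage over the request.
avg_ratio = (sum(usage_cores) / len(usage_cores)) / requested_cores
```

A ratio above 1 suggests the container uses more CPU than it requests. In this made-up case the burst-weighted ratio is well above 1 while the average ratio is below it, so the choice between the two queries can change which containers appear to be under-requested.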

@openshift-bot

Inactive enhancement proposals go stale after 28d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Mark the proposal as fresh by commenting /remove-lifecycle stale.
Stale proposals rot after an additional 7d of inactivity and eventually close.
Exclude this proposal from closing by commenting /lifecycle frozen.

If this proposal is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 12, 2022
@dhellmann
Contributor

/remove-lifecycle stale
/approve

@openshift-ci openshift-ci bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 14, 2022
@openshift-ci
Contributor

openshift-ci bot commented Jan 14, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dhellmann

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 14, 2022
@philipgough

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jan 14, 2022
@openshift-ci
Contributor

openshift-ci bot commented Jan 14, 2022

@wking: all tests passed!

@openshift-merge-robot openshift-merge-robot merged commit fd603fd into openshift:master Jan 14, 2022
@wking wking deleted the fix-rate-to-irate branch January 14, 2022 21:58