Emit AWS API operation duration/error/throttle metrics #842
Conversation
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: wongma7. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
/cc @gnufied, need your expertise or testimony from SREs relying on these metrics :-)
/cc @AndyXiangLi BTW, once we have these metrics, we may need to do some basic comparison/load testing against the in-tree driver for API call volume, in addition to the pod-startup testing we are already doing. I can probably sign up for it, time permitting :D
Pull Request Test Coverage Report for Build 1869
💛 - Coveralls
Left some minor comments. It is okay to rename the metrics as long as we cover it in the release notes; whenever AWS migration is enabled by default, the release note should capture the renamed metrics.
Also cc @Jiawei0227 and @msau42, who are tracking CSI migration work; metrics migration is a prerequisite for CSI migration.
/lgtm
Is this a bug fix or adding new feature? fix #806
What is this PR about? / Why do we need it? This publishes API metrics equivalent to those that the ebs plugin/aws cloud provider embedded in kube-controller-manager publishes today, such as DescribeInstances/DescribeVolumes call durations, error counts, and throttle error counts.
I'm not well-versed in how people intend to consume the metrics, so this PR is barebones: for now it just exposes them over port 80 using "k8s.io/component-base/metrics/legacyregistry".
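For illustration, here is a minimal sketch of registering duration, error, and throttle metric vectors with the legacy registry and serving them over HTTP. The metric names, labels, buckets, and the `serveMetrics` helper are assumptions made for this sketch, not the names or code this PR actually adds:

```go
package main

import (
	"net/http"

	"k8s.io/component-base/metrics"
	"k8s.io/component-base/metrics/legacyregistry"
)

var (
	// Histogram of AWS API call durations, labeled by operation name.
	apiRequestDuration = metrics.NewHistogramVec(
		&metrics.HistogramOpts{
			Name:    "aws_api_request_duration_seconds", // placeholder name
			Help:    "Latency of AWS API calls",
			Buckets: []float64{0.01, 0.05, 0.1, 0.5, 1, 5, 10},
		},
		[]string{"request"},
	)
	// Counter of AWS API calls that returned an error.
	apiRequestErrors = metrics.NewCounterVec(
		&metrics.CounterOpts{
			Name: "aws_api_request_errors", // placeholder name
			Help: "Number of AWS API calls that returned an error",
		},
		[]string{"request"},
	)
	// Counter of AWS API calls that were throttled.
	apiThrottledRequests = metrics.NewCounterVec(
		&metrics.CounterOpts{
			Name: "aws_api_throttled_requests_total", // placeholder name
			Help: "Number of throttled AWS API calls",
		},
		[]string{"request"},
	)
)

// serveMetrics registers the collectors and exposes them on /metrics.
// Port 80 matches the PR description above.
func serveMetrics() {
	legacyregistry.MustRegister(apiRequestDuration, apiRequestErrors, apiThrottledRequests)
	http.Handle("/metrics", legacyregistry.Handler())
	go func() {
		_ = http.ListenAndServe(":80", nil)
	}()
}

func main() {
	serveMetrics()
	select {} // keep the process alive so the endpoint can be scraped
}
```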
DIFFERENCES between my implementation and the cloud provider's:
- Use the SDK's `Complete` handler and refer to `request.Operation.Name` when emitting metrics, as opposed to wrapping every call and referring to a custom request name (see the sketch below).
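A hedged sketch of that Complete-handler approach, reusing the illustrative metric vectors from the previous snippet; the handler name and the `registerAWSMetricHandlers` helper are hypothetical, not the PR's actual code:

```go
// Assumes apiRequestDuration, apiRequestErrors, and apiThrottledRequests
// from the sketch above.
import (
	"time"

	"github.com/aws/aws-sdk-go/aws/request"
	"github.com/aws/aws-sdk-go/aws/session"
)

// registerAWSMetricHandlers attaches a single Complete handler to the session,
// so every SDK call is measured via request.Operation.Name instead of wrapping
// each API method with a custom request name.
func registerAWSMetricHandlers(sess *session.Session) {
	sess.Handlers.Complete.PushBackNamed(request.NamedHandler{
		Name: "ebs-csi/metrics", // hypothetical handler name
		Fn: func(r *request.Request) {
			op := r.Operation.Name // e.g. "DescribeVolumes"

			// Complete runs after all retries, so time since the request was
			// built approximates the total call duration.
			apiRequestDuration.WithLabelValues(op).Observe(time.Since(r.Time).Seconds())

			if r.Error != nil {
				apiRequestErrors.WithLabelValues(op).Inc()
				if request.IsErrorThrottle(r.Error) {
					apiThrottledRequests.WithLabelValues(op).Inc()
				}
			}
		},
	})
}
```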
What testing is done?
kubectl port-forward deployment/ebs-csi-controller 8080:80 -n kube-system
excerpt: