-
Notifications
You must be signed in to change notification settings - Fork 40k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
decouple workqueue metrics from prometheus #33792
decouple workqueue metrics from prometheus #33792
Conversation
Jenkins GCI GCE e2e failed for commit 1e8c071e1910525b6651d888f876a4bda9283ac4. Full PR test history. The magic incantation to run this job again is |
The new public types you're introducing lack documentation, so it's hard to try to infer what they're supposed to be. Like It would be a lot simpler to just have a private interface variable that defaulted to a no-op, and a public registration function. Other than the registration function, and the interface's methods, everything else should be private. e.g. var metricsProvider MetricsProvider = &noopProvider{}
func SetProvider(p MetricsProvider) { // set once } It also seems like the recording of queue and processing start times are being conflated with the metrics provider here. They should be separated out because one is how you calculate metrics, the other is how you expose and record them. |
Sorry for the confusion, I'll add the comments.
I thought of a better pattern, similar to what you asked for. I'll update the code. |
1e8c071
to
21b46bf
Compare
@krousey I changed the pattern to similar to what you suggested. I also added comments. Hopefully it's easier to read now.
The new version should have addressed this comment, if I understood you correctly. |
Jenkins verification failed for commit 21b46bf. Full PR test history. The magic incantation to run this job again is |
type queueMetrics interface { | ||
add(item t) | ||
get(item t) | ||
done(item t) | ||
} | ||
|
||
// GaugeMetric that represents a single numerical value that can |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// GaugeMetric represents ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
type queueMetrics interface { | ||
add(item t) | ||
get(item t) | ||
done(item t) | ||
} | ||
|
||
// GaugeMetric that represents a single numerical value that can | ||
// arbitrarily go up and down. | ||
// Its methods are a subset of prometheus.Gauge. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to call out prometheus here. Other implementations could exist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
) | ||
|
||
// You need to import the prometheus package to create and register prometheus |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe make this more generic and not prometheus specific.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Dec() | ||
} | ||
|
||
type noopGaugeMetric struct{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your no-ops can be condensed to one type:
type noopMetric struct{}
func (noopMetric) Inc() {}
func (noopMetric) Dec() {}
func (noopMetric) Observe(float64) {}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
return ret | ||
} | ||
// DepthMetricProvider creates DepthMetric. | ||
type DepthMetricProvider interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These should be condensed to the types of metrics they provide, not what the metrics are used for. i.e.
GaugeMetricProvider
ConterMetricProvider
SummaryMetricProvider
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I condensed them all to a MetricsProvider interface.
addTimes: map[t]time.Time{}, | ||
processingStartTimes: map[t]time.Time{}, | ||
} | ||
type noopDepthMetricProvider struct{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These noop providers can also be condensed to one type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
// prometheus metrics. To use this package, you just have to import it. | ||
|
||
func init() { | ||
workqueue.DefaultMetricsFactory.SetProviders( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Calling a method on a type whose definition is private to another package. Not that it won't work, it just feels wrong. Like if I wanted to search godoc.org for the documentation on SetProviders
, I wouldn't find it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation.
@@ -151,3 +188,63 @@ func (m *defaultRetryMetrics) retry() { | |||
|
|||
m.retries.Inc() | |||
} | |||
|
|||
type metricsFactory struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it might be easiest for me to just show what I meant by not needing a factory variable exposed.
var metricsFactory = struct {
counterMetricProvider CounterMetricProvider
gaugeMetricProvider GaugeMetricProvider
summaryMetricProvider SummaryMetricProvider
setProviders sync.Once
}{
counterMetricProvider: noopProvider{},
gaugeMetricProvider: noopProvider{},
summaryMetricProvider: noopProvider{},
}
func newQueueMetrics(name string) queueMetrics {
var ret *defaultQueueMetrics
if len(name) == 0 {
return ret
}
return &defaultQueueMetrics{
depth: metricsFactory.gaugeMetricProvider.NewGuageMetric(name, "depth"),
adds: metricsFactory.counterMetricProvider.NewCounterMetric(name, "adds"),
latency: metricsFactory.summaryMetricProvider.NewSummaryMetric(name, "latency"),
workDuration: metricsFactory.summaryMetricProvider.NewSummaryMetric(name, "duration"),
addTimes: map[t]time.Time{},
processingStartTimes: map[t]time.Time{},
}
}
func newRetryMetrics(name string) retryMetrics {
var ret *defaultRetryMetrics
if len(name) == 0 {
return ret
}
return &defaultRetryMetrics{
retries: metricsFactory.counterMetricProvider.NewCounterMetric(name, "retries"),
}
}
// SetProviders sets the metrics providers of the metricsFactory.
func SetProviders(
counterMetricProvider CounterMetricProvider,
gaugeMetricProvider GaugeMetricProvider,
summaryMetricProvider SummaryMetricProvider,
) {
metricsFactory.setProviders.Do(func() {
metricsFactory.counterMetricProvider = counterMetricProvider
metricsFactory.gaugeMetricProvider = gaugeMetricProvider
metricsFactory.summaryMetricProvider = summaryMetricProvider
})
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
) | ||
} | ||
|
||
type prometheusDepthMetricProvider struct{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These can also be condensed into one struct with multiple methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -43,6 +43,7 @@ import ( | |||
"k8s.io/kubernetes/pkg/util/flowcontrol" | |||
"k8s.io/kubernetes/pkg/util/wait" | |||
"k8s.io/kubernetes/pkg/util/workqueue" | |||
_ "k8s.io/kubernetes/pkg/util/workqueue/prometheus" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you try to put this import in package main instead of in these libraries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new github review UI is confusing. I definitely didn't type "Done" to this comment..
Anyway, do you mean importing it in pkg/controller? I'd rather keep this line next to the workqueue import line for easier tracking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant importing it in the relevant main packages in cmd/... Like the throttle metrics are done. Perhaps we should consolidate these metrics implementations into one package. Just import that one package and you get prometheus everywhere.
Another PR of course.
f4864f1
to
b094d6c
Compare
Jenkins Kubemark GCE e2e failed for commit b094d6c. Full PR test history. The magic incantation to run this job again is |
Jenkins GCI GKE smoke e2e failed for commit b094d6c. Full PR test history. The magic incantation to run this job again is |
Jenkins unit/integration failed for commit b094d6c. Full PR test history. The magic incantation to run this job again is |
@@ -43,6 +43,7 @@ import ( | |||
"k8s.io/kubernetes/pkg/util/flowcontrol" | |||
"k8s.io/kubernetes/pkg/util/wait" | |||
"k8s.io/kubernetes/pkg/util/workqueue" | |||
_ "k8s.io/kubernetes/pkg/util/workqueue/prometheus" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant importing it in the relevant main packages in cmd/... Like the throttle metrics are done. Perhaps we should consolidate these metrics implementations into one package. Just import that one package and you get prometheus everywhere.
Another PR of course.
var metricsFactory = struct { | ||
metricsProvider MetricsProvider | ||
setProviders sync.Once | ||
}{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should initialize metricsProvider
here to noopMetricsProvider
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
b094d6c
to
feb0d1d
Compare
I centralized the imports. I removed the import in the scheduler because it's just using |
Jenkins GCE e2e failed for commit feb0d1d. Full PR test history. The magic incantation to run this job again is |
Jenkins GKE smoke e2e failed for commit feb0d1d. Full PR test history. The magic incantation to run this job again is |
@k8s-bot gce e2e test this |
@krousey can I get the lgtm? Thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry... was going through email and this was at the bottom of it... until you bumped it that is
Thanks @krousey. I just added a release note (copied from yours), let me know if you have comments on that. |
Jenkins GCE Node e2e failed for commit feb0d1d. Full PR test history. The magic incantation to run this job again is |
Automatic merge from submit-queue |
What this PR does / why we need it:
We want to include the workqueue in client-go, but do not want to having to import Prometheus. This PR decouples the workqueue from prometheus.
Which issue this PR fixes (optional, in
fixes #<issue number>(, #<issue_number>, ...)
format, will close that issue when PR gets merged): fixes #Partially address #33497
User requested for
workqueue
in client-go: kubernetes/client-go#4 (comment)Special notes for your reviewer:
Release note:
This change is