-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(prometheus,shred-collector): metrics improvements + add metrics to shred collector #306
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like a great improvement - a few nits - and a few comments that are more global questions of 'how we name/organize metrics' -- which is more than what this PR asked for but i think it has the chance to make a large improvement to consistency if we decide these things here
can we add the comments on metrics to contributing.md?
|
also - should this PR include a corresponding grafana dashboard? doesnt seem to be commited |
9958d04
to
75f0806
Compare
…to shred collector
… of separate metrics
75f0806
to
38c9d9c
Compare
I figured I would do this later when I actually need to gain insights from the metrics. For now I just wanted to expose them in the code. It would be great to also have a dashboard already, but I just didn't prioritize it at this point. |
df884bc
to
9b17bfb
Compare
|
nice - just commited a quick fit to the naming -- some cases updated the struct name but not the field (ie, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The aim of this PR is to improve the observability of the shred collector with prometheus metrics. While doing so, I also made various improvements to the prometheus library. I also made changes throughout the rest of the codebase to take advantage of these improvements.
Shred Collector
Added metrics to:
Prometheus
VariantCounter
I added a new metric type:
VariantCounter
: This separately count the occurrence of each variant within an enum or error set. This is a composite metric, similar to a histogram, except the "buckets" are discrete, unordered, and identified by name, instead of representing numeric ranges. Currently it's only used for counting errors but it can also be used to count enum variants, since enums and errors are conceptually almost the same thing.I was struggling to decide if each error should be reported as a completely separate metric, or if the VariantCounter as a whole should be a single metric with many labels. I ended up decided to take the label approach because I think it's more flexible, but I'd appreciate feedback on this. See this commit for switching to labels: 0afc750
Initiailize structs containing many metrics
We have some structs like this:
Usually we define a custom init function for each of these structs. These init functions use comptime code to initialize all the fields. I moved all this logic into a single place. It is now in these methods in the Registry struct:
initStruct
: Init an entire metrics struct that contains many metricsinitFields
: Init metrics fields within a struct that has other dataConfiguration:
Metric.metric_type
This is provided within the definitions for Counter, Gauge, etc. to identify the type of metric soinitStruct
andinitFields
can work properly.MyMetricsStruct.prefix
optional: For namespacing of metrics. Within the struct containing metrics, this will string will be prefixed to any metric namesMyMetricsStruct.buckets
required for histograms. If the struct contains any histograms, you need to specify the buckets either with a const or a function that takes the field name and returns its buckets.Existing metrics tweaks
Counter
: removeset
methodGauge
: add min and max methods