-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create a public registry interface and separate out HTTP exposition #214
Conversation
Oops... will fix test soon. |
// protobufs are consistent and valid for Prometheus to ingest (e.g. no | ||
// duplicate metrics, no invalid identifiers). In scenarios where | ||
// complete collection is critical, the returned MetricFamily protobufs | ||
// should be disregareded if the returned error is non-nil. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
disregarded
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
It's great to finally see this.
It boils down mostly to a question of dependencies. Like Python, Go comes with HTTP support out of the box so having it all in one package is practical. By contrast that's not the case with Java where we need to pull in external dependencies to work. Where we run into issues here is protobuf, as that shouldn't be a dependency of the instrumentation (registry+collector+metrics+http instrumentation) part of the library.
To be honest, we should be replacing most of those uses in favour of the metrics auto-registering by default. What I've done elsewhere is to have the register method on the registry, but also a utility method on the metric for convenience. |
// default registry. The other global functions are not methods in the Registry | ||
// interface but there are utility functions that implement the same | ||
// functionality on top of the interface and take a Registry instance as a | ||
// parameter. Those are: MustRegister → MustRegisterWith, RegisterOrGet → |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems a rather big interface, and not following a OOP model by having them as standalone functions rather than methods.
Can we remove the Get methods? The only time I've seen them used is in cases where a ConstMetric should have been used instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this is a problem with Go's desire for lean interfaces and at the same time not being a "proper" OOP language. This is essentially a side effect of the absence of implementation inheritance. The standalone functions contain the code that would otherwise be repeated in every implementation of the Registry
interface. While this pattern might appear weird, it's very common in the standard library, e.g. ioutiles.ReadAll(io.Reader)
instead of having a ReadAll
method in the Reader interface.
WRT removing the Get methods: We can definitely consider that. I had a use case, but if it's rare, it can still be implemented by the few users in the same way as it is done here.
However, I don't want to have any avoidable breaking changes in this PR. So that people can have the public registry without breaking changes. Future iterations will be more aggressive WRT breaking existing usage, and then we can consider removal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How many registry implementations are you expecting? We've not seen a need in the other languages, and of all the other instrumentation libraries out there I can think of only one which is even vaguely this general.
I think we can simplify by having Registry as a type rather than an interface.
However, I don't want to have any avoidable breaking changes in this PR. So that people can have the public registry without breaking changes. Future iterations will be more aggressive WRT breaking existing usage, and then we can consider removal.
I'm expecting us to have to leave a lot of cruft lying around for a while for the purposes of transition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having an interface is the "fail safe" approach. Because this library is used quite widely, I assume a lot of requirements. I have observed the recent tendency to go for interfaces in doubt, and may people complaining about std library types that were not done as interfaces.
The normal use case is anyway to use the default registry. So only power user are even concerned with the registry as an own type. So I expect a higher percentage of "weird" use cases there, which we cannot anticipate right now. In any case, we don't have to optimize the registry itself for the naive user, as the latter will use the top-level functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My presumption would be that anything custom could be done without needing an interface. You could accept collectors and then be a standard collector yourself, and go out via a normal Registry to the exposition formats.
I don't at all like all these top-level functions, it feels very disjointed. A user should never have to touch the top-level functions, only instrumentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like the number of top-level identifiers, either.
However, bundling them up in classes in a language that doesn't even have proper classes is not really improving anything.
A real solution would be a meaningful re-packaging and of course avoiding many top-level identifiers in the first place. As re-packaging is only an option later (at the price of breaking the world), let's apply extra scrutiny to reducing the number of identifiers.
In that spirit, I'll declare the old ...OrGet
top level function as deprecated, and will not have a utility function for custom registries anymore (will be done in the next push).
Making the exported Registry a struct rather than an interface is a different question. I'll think about it... Other opinions welcome.
In the view of both the server and the Go client, the protobuf is the central data model of Prometheus.
I'm planning something like that for the next iteration. However, as Go has no exceptions, and panicking in a library if not explicitly called out (via
My plan is essentially to have a convenience var (
cpuTemp = prometheus.NewGauge(prometheus.GaugeOpts{
Name: "cpu_temperature_celsius",
Help: "Current temperature of the CPU.",
}).MustRegister()
) But yeah, not really part of this PR. |
// parameter if metrics should be pushed with the hostname as label. The | ||
// returned map is created upon each call so that the caller is free to add more | ||
// labels to the map. | ||
func GroupByHostname() map[string]string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is called instance_ip_grouping_key
and instanceIPGroupingKey
in other clients, we should be consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is also the IP in other clients, as this is meant to be replacing the old pgw behaviour of adding in the client IP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, definitely.
Question: Did you intentionally pick the IP number rather than the hostname? That needed to be changed here then, too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Race condition :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason for attaching the IP number was that we had no other information on the Pushgateway.
Picking the IP-number on the side of the client is a bit hard. It's not trivial through with IP-number the push will go out. And IP-numbers might get translated, or they might be long IPv6 numbers...
I'd actually favor the hostname here. (It's a convenience function anyway, and might even not be used a lot.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd expect IPs to be more unique than hostnames, as it's not a FQDN.
I think we should aim for consistency here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How have you picked the IP number in the other clients? The first one that doesn't look like localhost?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Java has methods for it. For Python we bind to a socket and see what the IP is (which turns out not to be terribly portable).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, so the Go stdlibrary allows to list all network interfaces. Picking one seems really adventurous to me. I dislike the IP number idea more and more. The hostname may be pretty arbitrary, too, but at least it's well defined. I like that more than the IP number. But I'm tempted to remove this convenience function completely (it is rarely useful anyway).
Does anybody else have an opinion on this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, and I strongly prefer hostname over IP.
With mutliple NICs, bonding, docker bridges, etc. there is an endless possibility to pick the wrong IP. On the other side, the hostname has a clear definition and retrieval interface. I've had to write code to retrieve the internal network IP of all our nodes before and it required many exceptions and knowledge about our network to not pick a wrong one. I've seen the same issues with other software trying to auto-pick the right IP (e.g. Consul).
Basically if I'm exporting to Graphite via a Prometheus client library, why should I be pulling in protobuf? This is a really important use case for me, I don't want any friction to users using our instrumentation library.
I don't think it'd be inappropriate to fatal when this happens. We should design the library with the presumption that this is the way things are meant to be used.
That's how Java works, but that's because of how we have to structure the options in Java. In Go we don't have that limitation, so I expect that there'd be no such a method. Rather you'd pass a registry in with the rest of the options, and provide a special no-registry value in case that's needed. |
// the injected protobufs. | ||
// | ||
// The hook function must be callable at any time and concurrently. | ||
func NewRegistryWithInjectionHook(hook func() []*dto.MetricFamily) Registry { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about this. This would require a custom registry for the node exporter, which doesn't feel right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a valid concern.
But I really don't want the super-special use-case of the injection hook in the interface.
On the other hand, we probably want error logging in the node exporter anyway, with all the possible things that can go wrong during collection. In that case, you need a custom registry anyway to configure the logger.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could adjust the node exporter not to use it, and then remove the feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pushgateway uses it, too.
Conclusion for now: NodeExporter will need to use a custom registry anyway (for logging).
So I'll leave it here as it is. Fine?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should also remove it from the pgw then, and use a collector instead. Given how the node exporter already works it doesn't require a custom registry, just a custom collector.
Needing 2^n
factory functions depending on options doesn't seem like a good idea. We've already two options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On 2 August 2016 at 14:28, Brian Brazil notifications@github.com wrote:
We're creating a new API, it's the time to look any any ways we can simplify that interface. Two use cases that could be implemented in another fashion doesn't justify a public API to me.
Yes, that mirrors an argument we had in the discussion back in early
2013. As said, I thought about changes in the circumstances, but I
couldn't detect any relevant changes, so I don't see a different
outcome of the discussion.
I tried to keep it out of the default path, but as you have brought up yourself, it would require a custom registry for the node exporter.
I never stated it must require a custom registry, merely that the current proposed implementation forces it. I want to continue using use the default registry in the node exporter.
Yes, that's what I meant. By using the default registry in the node
exporter, we need a way to inject metric families with the default
registry. Since the conclusion after a discussion with many
stakeholders was that we need an entry point for that separate from
regular registration and collection, a first-class support in the
interface has to be the way until somebody comes up with a better idea
that changes the odds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That discussion was 3 years ago, lots has happened since then.
The Collect interface is already defined in terms of protos via Metric, converting doesn't seem like it should be a big deal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On 2 August 2016 at 14:56, Brian Brazil notifications@github.com wrote:
That discussion was 3 years ago, lots has happened since then.
The Collect interface is already defined in terms of protos via Metric, converting doesn't seem like it should be a big deal.
To have a meaningful discussion, you have to trust me. I'm not trying
to deceive you.
I considered exactly what you said. While the code back then (as
sketched out by Bernerd) is not line-by-line exactly the same as it
would be today, it would be very similar in essence. We would still
convert a good part of a protobuf object into a different
representation, stuff it into (nowadays) a ConstMetric of the
appropriate type, just to convert everything back then.
Björn Rabenstein, Engineer
http://soundcloud.com/brabenstein
SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany
Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We would still convert a good part of a protobuf object into a different
representation, stuff it into (nowadays) a ConstMetric of the
appropriate type, just to convert everything back then.
You can guess what my opinion on how we resolve that is :)
Given the current code, it's not that bad. You need to create a Desc from the MetricFamily, and can then send each Metric down the channel.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On 2 August 2016 at 15:29, Brian Brazil notifications@github.com wrote:
Given the current code, it's not that bad. You need to create a Desc from
the MetricFamily, and can then send each Metric down the channel.
Yes, I'm a decent enough programmer to know how I would implement that. It
was considered in my decision.
Björn Rabenstein, Engineer
http://soundcloud.com/brabenstein
SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany
Managing Director: Alexander Ljung | Incorporated in England & Wales with
Company No. 6343600 | Local Branch Office | AG Charlottenburg | HRB 110657B
Because the canonical Prometheus data model is destribed in
Writing a consumer of the new The discussion if |
We are creating a library that aspires to be used in a massive amount of Go programs. I don't think we can work against fundamental conventions in the Go community while doing so.
Yes, that was my initial thought, too, but I couldn't find a way to do this in agreement with Go conventions. But this is another discussion I deliberately didn't want to have now. Let's have it later when I going for the metrics themselves. |
The question here is what dependencies the instrumentation library should have, which is a different question to how we mentally think about Prometheus data. We need to accept that some users have aversions to protobuf (in the same way you or I would about XML), and aversions to seemingly unnecessary dependencies as that's a sign of a bloated library. As you say "We are creating a library that aspires to be used in a massive amount of Go programs." |
// Metric instances “on the fly” using NewConstMetric, NewConstHistogram, and | ||
// NewConstSummary (and their respective Must… versions). That will happen in | ||
// the Collect method, while the Describe method has to return separate Desc | ||
// instances, the very same you later use to create the constant “throw-away” |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit hard to parse
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tried to reword.
(Note that this part will definitely be changed in later iterations. No need to touch Desc directly for const metrics.)
To be clear: The question if protobuf should disappear from client_golang is not on the table right now. That discussion has to happen outside of this PR (and in fact outside of the refurbishment I'm planning). |
type HandlerOpts struct { | ||
// ErrorLog specifies an optional logger for errors collecting and | ||
// serving metrics. If nil, errors are not logged at all. | ||
ErrorLog *log.Logger |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should take an interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it should.
Sadly, the standard library only offers the log.Logger
type, which is a struct
. I'd love to use an interface like https://github.com/go-kit/kit/blob/master/log/log.go#L10 but I don't think we can coerce users into using a particular logging library. By using the standard library log.Logger
, we allow everybody to use it with any library that provides an adapter, e.g. https://github.com/go-kit/kit/blob/master/log/stdlib.go#L54 .
Full example:
package main
import (
stdlog "log"
"os"
"github.com/go-kit/kit/log"
"github.com/prometheus/client_golang/prometheus"
)
func main() {
logger := log.NewLogfmtLogger(os.Stdout)
handler := prometheus.HandlerFor(prometheus.DefaultRegistry, prometheus.HandlerOpts{
ErrorLog: stdlog.New(log.NewStdlibAdapter(logger), "", 0),
})
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is also in line with https://golang.org/pkg/net/http/#Server , i.e. it's the devil people know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Known problem that stdlib should've made it an interface. I think so far a good practice was just adding your own interface with the minimal methods being used, i.e. type Logger interface { Println(...interface{}) error }
.
Allows passing in almost all loggers without relying on go-kit (but can still be used).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And BTW, @brian-brazil , this is an example, where historically, a struct
was chosen over an interface
, and now everybody is sad about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fabxc Since we are only using Println
, that's indeed doable. I'll rework in the next push. (There is the potential for a future problem should we decide to use more methods of the logger. But that's unlikely.)
General context and approch =========================== This is the first part of the long awaited wider refurbishment of `client_golang/prometheus/...`. After a lot of struggling, I decided to not go for one breaking big-bang, but cut things into smaller steps after all, mostly to keep the changes manageable and easy to review. I'm aiming for having the invasive breaking changes concentrated in as few steps as possible (ideally one). Some steps will not be breaking at all, but typically there will be breaking changes that only affect quite special cases so that 95+% of users will not be affected. This first step is an example for that, see details below. What's happening in this commit? ================================ This step is about finally creating an exported registry interface. This could not be done by simply export the existing internal implementation because the interface would be _way_ too fat. This commit introduces a qutie lean `Registry` interface (compared to the previous interval implementation). The functions that act on the default registry are retained (with very few exceptions) so that most use cases won't see a change. However, several of those are deprecated now to clean up the namespace in the future. The default registry is kept in the public variable `DefaultRegistry`. This follows the example of the http package in the standard library (cf. `http.DefaultServeMux`, `http.DefaultClient`) with the same implications. (This pattern is somewhat disputed within the Go community but I chose to go with the devil you know instead of creating something more complex or even disallowing any changes to the default registry. The current approach gives everybody the freedom to not touch DefaultRegistry or to do everything with a custom registry to play save.) Another important part in making the registry lean is the extraction of the HTTP exposition, which also allows for customization of the HTTP exposition. Note that the separation of metric collection and exposition has the side effect that managing the MetricFamily and Metric protobuf objects in a free-list or pool isn't really feasible anymore. By now (with better GC in more recent Go versions), the returns were anyway dimisishing. To be effective at all, scrapes had to happen more often than GC cycles, and even then most elements of the protobufs (everything excetp the MetricFamily and Metric structs themselves) would still cause allocation churn. In a future breaking change, the signature of the Write method in the Metric interface will be adjusted accordingly. In this commit, avoiding breakage is more important. The following issues are fixed by this commit (some solved "on the fly" now that I was touching the code anyway and it would have been stupid to port the bugs): #46 #100 #170 #205 Documentation including examples have been amended as required. What future changes does this commit enable? ============================================ The following items are not yet implemented, but this commit opens the possibility of implementing these independently. - The separation of the HTTP exposition allows the implementation of other exposition methods based on the Registry interface, as known from other Prometheus client libraries, e.g. sending the metrics to Graphite. Cf. #197 - The public `Registry` interface allows the implementation of convenience tools for testing metrics collection. Those tools can inspect the collected MetricFamily protobufs and compare them to expectation. Also, tests can use their own testing instance of a registry. Cf. #58 Notable non-goals of this commit ================================ Non-goals that will be tackled later ------------------------------------ The following two issues are quite closely connected to the changes in this commit but the line has been drawn deliberately to address them in later steps of the refurbishment: - `InstrumentHandler` has many known problems. The plan is to create a saner way to conveniently intrument HTTP handlers and remove the old `InstrumentHandler` altogether. To keep breakage low for now, even the default handler to expose metrics is still using the old `InstrumentHandler`. This leads to weird naming inconsistencies but I have deemed it better to not break the world right now but do it in the change that provides better ways of instrumenting HTTP handlers. Cf. #200 - There is work underway to make the whole handling of metric descriptors (`Desc`) more intuitive and transparent for the user (including an ability for less strict checking, cf. #47). That's quite invasive from the perspective of the internal code, namely the registry. I deliberately kept those changes out of this commit. - While this commit adds new external dependency, the effort to vendor anything within the library that is not visible in any exported types will have to be done later. Non-goals that _might_ be tackled later --------------------------------------- There is a strong and understandable urge to divide the `prometheus` package into a number of sub-packages (like `registry`, `collectors`, `http`, `metrics`, …). However, to not run into a multitude of circular import chains, this would need to break every single existing usage of the library. (As just one example, if the ubiquitious `prometheus.MustRegister` (with more than 2,000 uses on GitHub alone) is kept in the `prometheus` package, but the other registry concerns go into a new `registry` package, then the `prometheus` package would import the `registry` package (to call the actual register method), while at the same time the `registry` package needs to import the `prometheus` package to access `Collector`, `Metric`, `Desc` and more. If we moved `MustRegister` into the `registry` package, thousands of code lines would have to be fixed (which would be easy if the world was a mono repo, but it is not). If we moved everything else the proposed registry package needs into packages of their own, we would break thousands of other code lines.) The main problem is really the top-level functions like `MustRegister`, `Handler`, …, which effectively pull everything into one package. Those functions are however very convenient for the easy and very frequent use-cases. This problem has to be revisited later. For now, I'm trying to keep the amount of exported names in the package as low as possible (e.g. I unexported expvarCollector in this commit because the NewExpvarCollector constructor is enough to export, and it is now consistent with other collectors, like the goCollector). Non-goals that won't be tackled anytime soon -------------------------------------------- Something that I have played with a lot is "streaming collection", i.e. allow an implementation of the `Registry` interface that collects metrics incrementally and serves them while doing so. As it has turned out, this has many many issues and makes the `Registry` interface very clunky. Eventually, I made the call that it is unlikely we will really implement streaming collection; and making the interface more clunky for something that might not even happen is really a big no-no. Note that the `Registry` interface only creates the in-memory representation of the metric family protobufs in one go. The serializaton onto the wire can still be handled in a streaming fashion (which hasn't been done so far, without causing any trouble, but might be done in the future without breaking any interfaces). What are the breaking changes? ============================== - Signatures of functions pushing to Pushgateway have changed to allow arbitrary grouping (which was planned for a long time anyway, and now that I had to work on the Push code anyway for the registry refurbishment, I finally did it, cf. #100). With the gained insight that pushing to the default registry is almost never the right thing, and now that we are breaking the Push call anyway, all the Push functions were moved to their own package, which cleans up the namespace and is more idiomatic (pushing Collectors is now literally done by `push.Collectors(...)`). - The registry is doing more consistency checks by default now. Past creators of inconsistent metrics could have masked the problem by not setting `EnableCollectChecks`. Those inconsistencies will now be detected. (But note that a "best effort" metrics collection is now possible with `HandlerOpts.ErrorHandling = ContinueOnError`.) - `EnableCollectChecks` is gone. The registry is now performing some of those checks anyway (see previous item), and a registry with all of those checks can now be created with `NewPedanticRegistry` (only used for testing). - `PanicOnCollectError` is gone. This behavior can now be configured when creating a custom HTTP handler.
Sooo, several things updated:
That leaves us with really only one function that could be a method on the Registry interface (or struct, see below): A number of other top-level identifiers will disappear with future iterations (like Which leaves us with three remaining open questions (if I haven't missed anything):
Since Brian and I have apparently exchanged all our arguments already, it would be good if others could chime in. Note that I have changed the commit description according to all of this. I'll update the PR description manually soon. The breaking changes are really quite minimal by now. |
So this would fix #216, too. :) |
Minor doc improvement in the latest commit. |
In person discussion with @fabxc @juliusv @brancz resulted in:
I'll change that in two follow-up commits. |
The own implementation is much easier as it only has to serve our one use case.
OK, done. This should settle the fundamentals. |
This looks good to me. I was just coming to this repo to ask for these types to be exported so we don't always have this global registry. |
// the injected protobufs. | ||
// | ||
// The hook function must be callable at any time and concurrently. | ||
func (r *Registry) SetInjectionHook(hook func() []*dto.MetricFamily) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you thought about just using a custom gatherer instead of adding another function to the interface?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: This is not part of any of the interfaces. It's a use case so special that we don't need to offer that level of abstraction.
For the real question: What do you mean by "custom gatherer"? A gatherer is doing the job of gathering and merging metric families (and return the result). An injection hook is the source of more metric families to merge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excuse the ambiguous wording, I was referring to the package interface of client library.
I've never used the injection hook so far, so excuse if I'm missing something, but from looking at the signatures of the current functions, I assumed one could just write a MergeGatherer or so which would call Gather() on two provided gatherer objects (or one gatherer object, one function) and do the merging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. Yeah, that's actually a neat idea. The injection hook is essentially a gatherer. This sounds so neat that I'm afraid people will abuse it... ;) But it will make the semantics quite clear. I'll play with it. Thanks for the inspiration.
Reviewed 13 of 28 files at r1, 2 of 15 files at r2, 4 of 6 files at r4, 2 of 4 files at r5, 1 of 7 files at r6, 9 of 9 files at r8. Comments from Reviewable |
I implemented @grobie 's idea about the "merge gatherer". That turned out really nice. On the way, I refactored and then improved the consistency checking. |
I'm on vacation with only intermittent internet access. I'd still appreciate feedback about the latest version so that I can work on it when I find time and in particular wrap this up soon after being back. Thanks everybody. |
// uses HTTP method 'POST' to push to the Pushgateway.) | ||
func RegistryAdd(job string, grouping map[string]string, url string, reg prometheus.Gatherer) error { | ||
return push(job, grouping, url, reg, "POST") | ||
// AddFromGatherer works like Gatherer, but only previously pushed metrics with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FromGatherer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Cool! The only thing I don't like are the function names of the push package, but maybe others have some opinions or suggestions here. Otherwise LGTM. |
This allows to finally get rid of the infamous injection hook in the interface. The old SetMetricFamilyInjectionHook still exist as a deprecated function but is now implemented with the new plumbing under the hood. Now that we have multiple Gatherer implementation, I renamed push.Registry to push.FromGatherer. This commit also improves the consistency checks, which happened as a byproduct of the refactoring to allow checking in both the "merge gatherer" Gatherers as well as in the normal Registry.
e3905cd
to
a6321dd
Compare
Thanks, @grobie . I'll wait for ideas and objections from others over the weekend. |
Apparently, we have found the least bad name. ;) Merging now. |
Following prometheus/client_golang#214 the new recommended way of using the prometheus Handler is through the promhttp package
Following prometheus/client_golang#214 the new recommended way of using the prometheus Handler is through the promhttp package
@grobie who might be most keen on this.
@juliusv whom I gave a short preview in person.
But then probably everybody else is interested, @fabxc , @brian-brazil , …
General context and approch
This is the first part of the long awaited wider refurbishment of
client_golang/prometheus/...
. After a lot of struggling, I decidedto not go for one breaking big-bang, but cut things into smaller steps
after all, mostly to keep the changes manageable and easy to
review. I'm aiming for having the invasive breaking changes
concentrated in as few steps as possible (ideally one). Some steps
will not be breaking at all, but typically there will be breaking
changes that only affect quite special cases so that 95+% of users
will not be affected. This first step is an example for that, see
details below.
What's happening in this commit?
This step is about finally creating an exported registry
interface. This could not be done by simply export the existing
internal implementation because the interface would be way too
fat. This commit introduces a qutie lean
Registry
interface(compared to the previous interval implementation). The functions that
act on the default registry are retained (with very few exceptions) so
that most use cases won't see a change. However, several of those are
deprecated now to clean up the namespace in the future.
The default registry is kept in the public variable
DefaultRegistry
. This follows the example of the http package in thestandard library (cf.
http.DefaultServeMux
,http.DefaultClient
)with the same implications. (This pattern is somewhat disputed within
the Go community but I chose to go with the devil you know instead of
creating something more complex or even disallowing any changes to the
default registry. The current approach gives everybody the freedom to
not touch DefaultRegistry or to do everything with a custom registry
to play save.)
Another important part in making the registry lean is the extraction
of the HTTP exposition, which also allows for customization of the
HTTP exposition. Note that the separation of metric collection and
exposition has the side effect that managing the MetricFamily and
Metric protobuf objects in a free-list or pool isn't really feasible
anymore. By now (with better GC in more recent Go versions), the
returns were anyway dimisishing. To be effective at all, scrapes had
to happen more often than GC cycles, and even then most elements of
the protobufs (everything excetp the MetricFamily and Metric structs
themselves) would still cause allocation churn. In a future breaking
change, the signature of the Write method in the Metric interface will
be adjusted accordingly. In this commit, avoiding breakage is more
important.
The following issues are fixed by this commit (some solved "on the
fly" now that I was touching the code anyway and it would have been
stupid to port the bugs):
#46
#100
#170
#205
Documentation including examples have been amended as required.
What future changes does this commit enable?
The following items are not yet implemented, but this commit opens the
possibility of implementing these independently.
other exposition methods based on the Registry interface, as known
from other Prometheus client libraries, e.g. sending the metrics to
Graphite.
Cf. Provide graphite bridge #197
Registry
interface allows the implementation ofconvenience tools for testing metrics collection. Those tools can
inspect the collected MetricFamily protobufs and compare them to
expectation. Also, tests can use their own testing instance of a
registry.
Cf. Improve testability #58
Notable non-goals of this commit
Non-goals that will be tackled later
The following two issues are quite closely connected to the changes in
this commit but the line has been drawn deliberately to address them
in later steps of the refurbishment:
InstrumentHandler
has many known problems. The plan is to create asaner way to conveniently intrument HTTP handlers and remove the old
InstrumentHandler
altogether. To keep breakage low for now, eventhe default handler to expose metrics is still using the old
InstrumentHandler
. This leads to weird naming inconsistencies butI have deemed it better to not break the world right now but do it
in the change that provides better ways of instrumenting HTTP
handlers.
Cf. Race with httputil.ReverseProxy #200
descriptors (
Desc
) more intuitive and transparent for the user(including an ability for less strict checking,
cf. Allow more flexible metric exposition. #47). That's
quite invasive from the perspective of the internal code, namely the
registry. I deliberately kept those changes out of this commit.
anything within the library that is not visible in any exported
types will have to be done later.
Non-goals that might be tackled later
There is a strong and understandable urge to divide the
prometheus
package into a number of sub-packages (like
registry
,collectors
,http
,metrics
, …). However, to not run into a multitude ofcircular import chains, this would need to break every single existing
usage of the library. (As just one example, if the ubiquitious
prometheus.MustRegister
(with more than 2,000 uses on GitHub alone)is kept in the
prometheus
package, but the other registry concernsgo into a new
registry
package, then theprometheus
package wouldimport the
registry
package (to call the actual register method),while at the same time the
registry
package needs to import theprometheus
package to accessCollector
,Metric
,Desc
andmore. If we moved
MustRegister
into theregistry
package,thousands of code lines would have to be fixed (which would be easy if
the world was a mono repo, but it is not). If we moved everything else
the proposed registry package needs into packages of their own, we
would break thousands of other code lines.)
The main problem is really the top-level functions like
MustRegister
,Handler
, …, which effectively pull everything intoone package. Those functions are however very convenient for the easy
and very frequent use-cases.
This problem has to be revisited later.
For now, I'm trying to keep the amount of exported names in the
package as low as possible (e.g. I unexported expvarCollector in this
commit because the NewExpvarCollector constructor is enough to export,
and it is now consistent with other collectors, like the goCollector).
Non-goals that won't be tackled anytime soon
Something that I have played with a lot is "streaming collection",
i.e. allow an implementation of the
Registry
interface that collectsmetrics incrementally and serves them while doing so. As it has turned
out, this has many many issues and makes the
Registry
interface veryclunky. Eventually, I made the call that it is unlikely we will really
implement streaming collection; and making the interface more clunky
for something that might not even happen is really a big no-no. Note
that the
Registry
interface only creates the in-memoryrepresentation of the metric family protobufs in one go. The
serializaton onto the wire can still be handled in a streaming fashion
(which hasn't been done so far, without causing any trouble, but might
be done in the future without breaking any interfaces).
What are the breaking changes?
arbitrary grouping (which was planned for a long time anyway, and
now that I had to work on the Push code anyway for the registry
refurbishment, I finally did it,
cf. Change signature for functions pushing to pushgateway. #100).
With the gained insight that pushing to the default registry is almost
never the right thing, and now that we are breaking the Push call
anyway, all the Push functions were moved to their own package,
which cleans up the namespace and is more idiomatic (pushing
Collectors is now literally done by
push.Collectors(...)
).creators of inconsistent metrics could have masked the problem by
not setting
EnableCollectChecks
. Those inconsistencies will now bedetected. (But note that a "best effort" metrics collection is now
possible with
HandlerOpts.ErrorHandling = ContinueOnError
.)EnableCollectChecks
is gone. The registry is now performing someof those checks anyway (see previous item), and a registry with all
of those checks can now be created with
NewPedanticRegistry
(onlyused for testing).
PanicOnCollectError
is gone. This behavior can now be configuredwhen creating a custom HTTP handler.
This change is