From 4521829c03ef820ae1686f0cef1373e4221b84b7 Mon Sep 17 00:00:00 2001
From: beorn7
Date: Wed, 20 Jul 2016 17:11:14 +0200
Subject: [PATCH] Create a public registry interface and separate out HTTP exposition
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

General context and approach
============================

This is the first part of the long awaited wider refurbishment of
`client_golang/prometheus/...`. After a lot of struggling, I decided
not to go for one breaking big bang but to cut things into smaller
steps after all, mostly to keep the changes manageable and easy to
review. I'm aiming to concentrate the invasive breaking changes in as
few steps as possible (ideally one). Some steps will not be breaking at
all, but typically there will be breaking changes that only affect
quite special cases, so that 95+% of users will not be affected. This
first step is an example of that, see details below.

What's happening in this commit?
================================

This step is about finally creating an exported registry interface.
This could not be done by simply exporting the existing internal
implementation because the interface would be _way_ too fat. This
commit introduces a very lean `Registry` interface. Most of the
existing functionality that is not part of that interface is provided
by helper functions, not by methods (e.g. `MustRegisterWith`). The
functions that act on the default registry are retained (with very few
exceptions) so that most use cases won't see a change. The default
registry is kept in the public variable `DefaultRegistry`. This follows
the example of the http package in the standard library
(cf. `http.DefaultServeMux`, `http.DefaultClient`) with the same
implications.

Another important part in making the registry lean is the extraction of
the HTTP exposition, which also allows for customization of the HTTP
exposition.

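To illustrate the design described above (a lean interface, helper
functions instead of methods, and a package-level default instance),
here is a minimal, self-contained sketch. It does not use the real
library: `Collector` is reduced to a hypothetical `Name()` method,
`mapRegistry` and `named` are made-up stand-ins, and the sketch is not
concurrency-safe. Only the names `Registry`, `NewRegistry`,
`DefaultRegistry`, `MustRegisterWith` and `MustRegister` mirror this
commit.

```go
package main

import (
	"errors"
	"fmt"
)

// Collector is a hypothetical stand-in for prometheus.Collector,
// reduced to a name so that the example stays self-contained.
type Collector interface {
	Name() string
}

// Registry is deliberately lean: only the core operations are methods.
type Registry interface {
	Register(Collector) error
	Unregister(Collector) bool
}

// mapRegistry is a toy implementation keyed by collector name.
// It is NOT safe for concurrent use.
type mapRegistry struct {
	collectors map[string]Collector
}

var errDuplicate = errors.New("duplicate collector registration attempted")

// NewRegistry creates an empty registry.
func NewRegistry() Registry {
	return &mapRegistry{collectors: map[string]Collector{}}
}

func (r *mapRegistry) Register(c Collector) error {
	if _, ok := r.collectors[c.Name()]; ok {
		return errDuplicate
	}
	r.collectors[c.Name()] = c
	return nil
}

func (r *mapRegistry) Unregister(c Collector) bool {
	if _, ok := r.collectors[c.Name()]; !ok {
		return false
	}
	delete(r.collectors, c.Name())
	return true
}

// MustRegisterWith is a helper function rather than a method, so any
// Registry implementation gets it for free.
func MustRegisterWith(r Registry, cs ...Collector) {
	for _, c := range cs {
		if err := r.Register(c); err != nil {
			panic(err)
		}
	}
}

// DefaultRegistry plays the same role as http.DefaultServeMux.
var DefaultRegistry = NewRegistry()

// MustRegister is a shortcut that acts on DefaultRegistry.
func MustRegister(cs ...Collector) {
	MustRegisterWith(DefaultRegistry, cs...)
}

// named is a trivial Collector used for demonstration only.
type named string

func (n named) Name() string { return string(n) }

func main() {
	// Most users keep calling the package-level shortcut.
	MustRegister(named("cpu_temp"))

	// Tests or advanced setups can use an isolated registry instead.
	custom := NewRegistry()
	MustRegisterWith(custom, named("cpu_temp"))
	fmt.Println("second registration error:", custom.Register(named("cpu_temp")))
}
```

Because the helpers only depend on the interface, a test can pass in its
own registry without touching the shared default one.
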
The following issues are fixed by this commit (some solved "on the fly"
now that I was touching the code anyway and it would have been stupid
to port the bugs):

https://github.com/prometheus/client_golang/issues/46
https://github.com/prometheus/client_golang/issues/100
https://github.com/prometheus/client_golang/issues/170
https://github.com/prometheus/client_golang/issues/205

What future changes does this commit enable?
============================================

The following items are not yet implemented, but this commit opens the
possibility of implementing these independently.

- The separation of the HTTP exposition allows the implementation of
  other exposition methods based on the Registry interface, as known
  from other Prometheus client libraries, e.g. sending the metrics to
  Graphite.
  Cf. https://github.com/prometheus/client_golang/issues/197

- The public `Registry` interface allows the implementation of
  convenience tools for testing metrics collection. Those tools can
  inspect the collected MetricFamily protobufs and compare them to
  expectation. Also, tests can use their own testing instance of a
  registry.
  Cf. https://github.com/prometheus/client_golang/issues/58

Notable non-goals of this commit
================================

Non-goals that will be tackled later
------------------------------------

The following two issues are quite closely connected to the changes in
this commit but the line has been drawn deliberately to address them in
later steps of the refurbishment:

- `InstrumentHandler` has many known problems. The plan is to create a
  saner way to conveniently instrument HTTP handlers and remove the old
  `InstrumentHandler` altogether. To keep breakage low for now, even
  the default handler to expose metrics is still using the old
  `InstrumentHandler`. Cf.
  https://github.com/prometheus/client_golang/issues/200

- There is work underway to make the whole handling of metric
  descriptors (`Desc`) more intuitive and transparent for the user
  (including an ability for less strict checking,
  cf. https://github.com/prometheus/client_golang/issues/47). That's
  quite invasive from the perspective of the internal code, namely the
  registry. I deliberately kept those changes out of this commit.

Non-goals that _might_ be tackled later
---------------------------------------

There is a strong and understandable urge to divide the `prometheus`
package into a number of sub-packages (like `registry`, `collectors`,
`http`, `metrics`, …). However, to not run into a multitude of circular
import chains, this would need to break every single existing usage of
the library. (As just one example, if the ubiquitous
`prometheus.MustRegister` (with more than 2,000 uses on GitHub alone)
is kept in the `prometheus` package, but the other registry concerns go
into a new `registry` package, then the `prometheus` package would
import the `registry` package (to call the actual register method),
while at the same time the `registry` package needs to import the
`prometheus` package to access `Collector`, `Metric`, `Desc` and more.
If we moved `MustRegister` into the `registry` package, thousands of
code lines would have to be fixed (which would be easy if the world was
a mono repo, but it is not).) The main problem is really the top-level
functions like `MustRegister`, `Handler`, `Push`, …, which effectively
pull everything into one package. Those functions are however very
convenient for the easy and very frequent use-cases.

This problem has to be revisited later.

For now, I'm trying to keep the number of exported names in the package
as low as possible (e.g. I unexported expvarCollector in this commit
because the NewExpvarCollector constructor is enough to export, similar
to the other exporters).

Non-goals that won't be tackled anytime soon
--------------------------------------------

Something that I have played with a lot is "streaming collection",
i.e. allow an implementation of the `Registry` interface that collects
metrics incrementally and serves them while doing so. As it has turned
out, this has many, many issues and makes the `Registry` interface very
clunky. Eventually, I made the call that it is unlikely we will really
implement streaming collection, and making the interface more clunky
for something that might not even happen is really a big no-no. Note
that the `Registry` interface only creates the in-memory representation
of the metric family protobufs in one go. The serialization onto the
wire can still be handled in a streaming fashion.

What are the breaking changes?
==============================

- Signature of functions pushing to Pushgateway has changed to allow
  arbitrary grouping (long planned anyway, and now that I worked on it
  anyway, I did it,
  cf. https://github.com/prometheus/client_golang/issues/100).

- `SetMetricFamilyInjectionHook` is gone. A registry with a
  MetricFamily injection hook has to be created now with
  `NewRegistryWithInjectionHook`.

- `PanicOnCollectError` is gone. This behavior can now be configured
  when creating a custom HTTP handler.

- `EnableCollectChecks` is gone. A registry with those checks can now
  be created with `NewPedanticRegistry` (it is only ever used to test
  custom Collectors).

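As a rough illustration of what "arbitrary grouping" means for the
push functions: the Pushgateway identifies a group by the job name plus
any number of label/value pairs encoded in the URL path
(`/metrics/job/<job>/<label>/<value>/...`). The sketch below builds
such a path. `pushURL` is a hypothetical helper written for this
message, not a function of this package, and the validation it does
(rejecting `/` and empty values) is a simplifying assumption.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// pushURL is a hypothetical helper (not part of client_golang). It
// assembles a Pushgateway URL from a base URL, a job name, and an
// arbitrary grouping of label name/value pairs.
func pushURL(base, job string, grouping map[string]string) (string, error) {
	if job == "" || strings.Contains(job, "/") {
		return "", fmt.Errorf("invalid job name %q", job)
	}

	// Sort label names so that the same grouping always yields the
	// same URL (Go map iteration order is random).
	names := make([]string, 0, len(grouping))
	for name := range grouping {
		names = append(names, name)
	}
	sort.Strings(names)

	parts := []string{strings.TrimRight(base, "/"), "metrics", "job", job}
	for _, name := range names {
		value := grouping[name]
		if name == "job" {
			return "", fmt.Errorf("grouping must not contain the label %q", name)
		}
		// Simplification for this sketch: values that would break
		// the path structure are rejected rather than encoded.
		if value == "" || strings.Contains(value, "/") {
			return "", fmt.Errorf("invalid value %q for label %q", value, name)
		}
		parts = append(parts, name, value)
	}
	return strings.Join(parts, "/"), nil
}

func main() {
	u, err := pushURL("http://pushgateway:9091", "db_backup",
		map[string]string{"instance": "host1", "dc": "eu"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(u)
}
```

With the old `job, instance` signature only a single fixed label could
be part of the group; a map-like grouping argument removes that limit.
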
--- NOTICE | 5 - prometheus/collector.go | 20 +- prometheus/doc.go | 41 +- prometheus/{expvar.go => expvar_collector.go} | 32 +- ...xpvar_test.go => expvar_collector_test.go} | 0 prometheus/go_collector.go | 2 +- prometheus/http.go | 31 ++ prometheus/process_collector.go | 4 +- prometheus/push.go | 12 +- prometheus/registry.go | 459 +++++++++--------- 10 files changed, 331 insertions(+), 275 deletions(-) rename prometheus/{expvar.go => expvar_collector.go} (81%) rename prometheus/{expvar_test.go => expvar_collector_test.go} (100%) diff --git a/NOTICE b/NOTICE index 37e4a7d41..dd878a30e 100644 --- a/NOTICE +++ b/NOTICE @@ -7,11 +7,6 @@ SoundCloud Ltd. (http://soundcloud.com/). The following components are included in this product: -goautoneg -http://bitbucket.org/ww/goautoneg -Copyright 2011, Open Knowledge Foundation Ltd. -See README.txt for license details. - perks - a fork of https://github.com/bmizerany/perks https://github.com/beorn7/perks Copyright 2013-2015 Blake Mizerany, Björn Rabenstein diff --git a/prometheus/collector.go b/prometheus/collector.go index c04688009..adc07b172 100644 --- a/prometheus/collector.go +++ b/prometheus/collector.go @@ -37,16 +37,16 @@ type Collector interface { // executing this method, it must send an invalid descriptor (created // with NewInvalidDesc) to signal the error to the registry. Describe(chan<- *Desc) - // Collect is called by Prometheus when collecting metrics. The - // implementation sends each collected metric via the provided channel - // and returns once the last metric has been sent. The descriptor of - // each sent metric is one of those returned by Describe. Returned - // metrics that share the same descriptor must differ in their variable - // label values. This method may be called concurrently and must - // therefore be implemented in a concurrency safe way. Blocking occurs - // at the expense of total performance of rendering all registered - // metrics. 
Ideally, Collector implementations support concurrent - // readers. + // Collect is called by the Prometheus registry when collecting + // metrics. The implementation sends each collected metric via the + // provided channel and returns once the last metric has been sent. The + // descriptor of each sent metric is one of those returned by + // Describe. Returned metrics that share the same descriptor must differ + // in their variable label values. This method may be called + // concurrently and must therefore be implemented in a concurrency safe + // way. Blocking occurs at the expense of total performance of rendering + // all registered metrics. Ideally, Collector implementations support + // concurrent readers. Collect(chan<- Metric) } diff --git a/prometheus/doc.go b/prometheus/doc.go index ca56f5ede..5b93c3f9e 100644 --- a/prometheus/doc.go +++ b/prometheus/doc.go @@ -11,18 +11,14 @@ // See the License for the specific language governing permissions and // limitations under the License. -// Package prometheus provides embeddable metric primitives for servers and -// standardized exposition of telemetry through a web services interface. +// Package prometheus provides metrics primitives to instrument code for +// monitoring. It also offers a registry for metrics and ways to expose +// registered metrics via an HTTP endpoint or push them to a Pushgateway. // // All exported functions and methods are safe to be used concurrently unless // specified otherwise. // -// To expose metrics registered with the Prometheus registry, an HTTP server -// needs to know about the Prometheus handler. The usual endpoint is "/metrics". 
-// -// http.Handle("/metrics", prometheus.Handler()) -// -// As a starting point a very basic usage example: +// As a starting point, a very basic usage example: // // package main // @@ -44,6 +40,7 @@ // ) // // func init() { +// // Metrics have to be registered to be exposed: // prometheus.MustRegister(cpuTemp) // prometheus.MustRegister(hdFailures) // } @@ -52,6 +49,8 @@ // cpuTemp.Set(65.3) // hdFailures.Inc() // +// // The Handler function provides a default handler to expose metrics +// // via an HTTP server. "/metrics" is the usual endpoint for that. // http.Handle("/metrics", prometheus.Handler()) // http.ListenAndServe(":8080", nil) // } @@ -74,8 +73,8 @@ // Those are all the parts needed for basic usage. Detailed documentation and // examples are provided below. // -// Everything else this package offers is essentially for "power users" only. A -// few pointers to "power user features": +// Everything else this package and its sub-packages offer is essentially for +// "power users" only. A few pointers to "power user features": // // All the various ...Opts structs have a ConstLabels field for labels that // never change their value (which is only useful under special circumstances, @@ -84,9 +83,6 @@ // The Untyped metric behaves like a Gauge, but signals the Prometheus server // not to assume anything about its type. // -// Functions to fine-tune how the metric registry works: EnableCollectChecks, -// PanicOnCollectError, Register, Unregister, SetMetricFamilyInjectionHook. -// // For custom metric collection, there are two entry points: Custom Metric // implementations and custom Collector implementations. A Metric is the // fundamental unit in the Prometheus data model: a sample at a point in time @@ -105,7 +101,24 @@ // collection time, MetricVec to bundle custom Metrics into a metric vector // Collector, SelfCollector to make a custom Metric collect itself. 
// -// A good example for a custom Collector is the ExpVarCollector included in this +// A good example for a custom Collector is the expvarCollector included in this // package, which exports variables exported via the "expvar" package as // Prometheus metrics. +// +// The functions Register, Unregister, MustRegister, RegisterOrGet, and +// MustRegisterOrGet all act on the default registry. They wrap other calls as +// described in their doc comment. For advanced use cases, you can work with +// custom registries (created by NewRegistry and similar) and call the wrapped +// functions directly. +// +// The functions Handler and UninstrumentedHandler create an HTTP handler to +// serve metrics from the default registry in the default way, which covers most +// of the use cases. With HandlerFor, you can create a custom HTTP handler for +// custom registries. +// +// The functions Push and PushAdd push the metrics from the default registry via +// HTTP to a Pushgateway. With PushFrom and PushAddFrom, you can push the +// metrics from custom registries. However, often you just want to push a +// handful of Collectors only. For that case, there are the convenience +// functions PushCollectors and PushAddCollectors. package prometheus diff --git a/prometheus/expvar.go b/prometheus/expvar_collector.go similarity index 81% rename from prometheus/expvar.go rename to prometheus/expvar_collector.go index 0f7630d53..18a99d5fa 100644 --- a/prometheus/expvar.go +++ b/prometheus/expvar_collector.go @@ -18,21 +18,21 @@ import ( "expvar" ) -// ExpvarCollector collects metrics from the expvar interface. It provides a -// quick way to expose numeric values that are already exported via expvar as -// Prometheus metrics. Note that the data models of expvar and Prometheus are -// fundamentally different, and that the ExpvarCollector is inherently -// slow. 
Thus, the ExpvarCollector is probably great for experiments and -// prototying, but you should seriously consider a more direct implementation of -// Prometheus metrics for monitoring production systems. -// -// Use NewExpvarCollector to create new instances. -type ExpvarCollector struct { +type expvarCollector struct { exports map[string]*Desc } -// NewExpvarCollector returns a newly allocated ExpvarCollector that still has -// to be registered with the Prometheus registry. +// NewExpvarCollector returns a newly allocated expvar Collector that still has +// to be registered with a Prometheus registry. +// +// An expvar Collector collects metrics from the expvar interface. It provides a +// quick way to expose numeric values that are already exported via expvar as +// Prometheus metrics. Note that the data models of expvar and Prometheus are +// fundamentally different, and that the expvar Collector is inherently slower +// than native Prometheus metrics. Thus, the expvar Collector is probably great +// for experiments and prototyping, but you should seriously consider a more +// direct implementation of Prometheus metrics for monitoring production +// systems. // // The exports map has the following meaning: // @@ -59,21 +59,21 @@ type ExpvarCollector struct { // sample values. // // Anything that does not fit into the scheme above is silently ignored. -func NewExpvarCollector(exports map[string]*Desc) *ExpvarCollector { - return &ExpvarCollector{ +func NewExpvarCollector(exports map[string]*Desc) Collector { + return &expvarCollector{ exports: exports, } } // Describe implements Collector. -func (e *ExpvarCollector) Describe(ch chan<- *Desc) { +func (e *expvarCollector) Describe(ch chan<- *Desc) { for _, desc := range e.exports { ch <- desc } } // Collect implements Collector. 
-func (e *ExpvarCollector) Collect(ch chan<- Metric) { +func (e *expvarCollector) Collect(ch chan<- Metric) { for name, desc := range e.exports { var m Metric expVar := expvar.Get(name) diff --git a/prometheus/expvar_test.go b/prometheus/expvar_collector_test.go similarity index 100% rename from prometheus/expvar_test.go rename to prometheus/expvar_collector_test.go diff --git a/prometheus/go_collector.go b/prometheus/go_collector.go index b0d4fb95c..abc9d4ec4 100644 --- a/prometheus/go_collector.go +++ b/prometheus/go_collector.go @@ -17,7 +17,7 @@ type goCollector struct { // NewGoCollector returns a collector which exports metrics about the current // go process. -func NewGoCollector() *goCollector { +func NewGoCollector() Collector { return &goCollector{ goroutines: NewGauge(GaugeOpts{ Namespace: "go", diff --git a/prometheus/http.go b/prometheus/http.go index e078e3ed1..46c0e5c3d 100644 --- a/prometheus/http.go +++ b/prometheus/http.go @@ -23,6 +23,37 @@ import ( "time" ) +// Handler returns the HTTP handler for the global Prometheus registry. It is +// already instrumented with InstrumentHandler (using "prometheus" as handler +// name). Usually the handler is used to handle the "/metrics" endpoint. +// +// Please note the issues described in the doc comment of InstrumentHandler. You +// might want to consider using UninstrumentedHandler instead. +func Handler() http.Handler { + // TODO + // return InstrumentHandler("prometheus", defRegistry) + return nil +} + +// UninstrumentedHandler works in the same way as Handler, but the returned HTTP +// handler is not instrumented. This is useful if no instrumentation is desired +// (for whatever reason) or if the instrumentation has to happen with a +// different handler name (or with a different instrumentation approach +// altogether). See the InstrumentHandler example. 
+func UninstrumentedHandler() http.Handler { + // TODO + // return defRegistry + return nil +} + +func HandlerFor(r Registry, opts HandlerOpts) http.Handler { + return nil // TODO +} + +type HandlerOpts struct { + // TODO check how http stdlib is done, error handling, logging +} + var instLabels = []string{"method", "code"} type nower interface { diff --git a/prometheus/process_collector.go b/prometheus/process_collector.go index d8cf0eda3..e31e62e78 100644 --- a/prometheus/process_collector.go +++ b/prometheus/process_collector.go @@ -28,7 +28,7 @@ type processCollector struct { // NewProcessCollector returns a collector which exports the current state of // process metrics including cpu, memory and file descriptor usage as well as // the process start time for the given process id under the given namespace. -func NewProcessCollector(pid int, namespace string) *processCollector { +func NewProcessCollector(pid int, namespace string) Collector { return NewProcessCollectorPIDFn( func() (int, error) { return pid, nil }, namespace, @@ -43,7 +43,7 @@ func NewProcessCollector(pid int, namespace string) *processCollector { func NewProcessCollectorPIDFn( pidFn func() (int, error), namespace string, -) *processCollector { +) Collector { c := processCollector{ pidFn: pidFn, collectFn: func(chan<- Metric) {}, diff --git a/prometheus/push.go b/prometheus/push.go index 5ec0a3ab3..ad6eedc62 100644 --- a/prometheus/push.go +++ b/prometheus/push.go @@ -29,17 +29,25 @@ package prometheus // Note that all previously pushed metrics with the same job and instance will // be replaced with the metrics pushed by this call. (It uses HTTP method 'PUT' // to push to the Pushgateway.) -func Push(job, instance, url string) error { +func Push(job, instance, url string) error { // TODO grouping return defRegistry.Push(job, instance, url, "PUT") } // PushAdd works like Push, but only previously pushed metrics with the same // name (and the same job and instance) will be replaced. 
(It uses HTTP method // 'POST' to push to the Pushgateway.) -func PushAdd(job, instance, url string) error { +func PushAdd(job, instance, url string) error { // TODO grouping return defRegistry.Push(job, instance, url, "POST") } +func PushFrom(r Registry, grouping ) error { + return nil // TODO +} + +func PushAddFrom(r Registry, grouping ) error { + return nil // TODO +} + // PushCollectors works like Push, but it does not collect from the default // registry. Instead, it collects from the provided collectors. It is a // convenient way to push only a few metrics. diff --git a/prometheus/registry.go b/prometheus/registry.go index f6ae51bed..714ad7665 100644 --- a/prometheus/registry.go +++ b/prometheus/registry.go @@ -38,197 +38,212 @@ import ( dto "github.com/prometheus/client_model/go" ) -var ( - defRegistry = newDefaultRegistry() - errAlreadyReg = errors.New("duplicate metrics collector registration attempted") -) - -// Constants relevant to the HTTP interface. const ( - // APIVersion is the version of the format of the exported data. This - // will match this library's version, which subscribes to the Semantic - // Versioning scheme. - APIVersion = "0.0.4" - - // DelimitedTelemetryContentType is the content type set on telemetry - // data responses in delimited protobuf format. - DelimitedTelemetryContentType = `application/vnd.google.protobuf; proto=io.prometheus.client.MetricFamily; encoding=delimited` - // TextTelemetryContentType is the content type set on telemetry data - // responses in text format. - TextTelemetryContentType = `text/plain; version=` + APIVersion - // ProtoTextTelemetryContentType is the content type set on telemetry - // data responses in protobuf text format. (Only used for debugging.) - ProtoTextTelemetryContentType = `application/vnd.google.protobuf; proto=io.prometheus.client.MetricFamily; encoding=text` - // ProtoCompactTextTelemetryContentType is the content type set on - // telemetry data responses in protobuf compact text format. 
(Only used - // for debugging.) - ProtoCompactTextTelemetryContentType = `application/vnd.google.protobuf; proto=io.prometheus.client.MetricFamily; encoding=compact-text` - - // Constants for object pools. - numBufs = 4 - numMetricFamilies = 1000 - numMetrics = 10000 - // Capacity for the channel to collect metrics and descriptors. capMetricChan = 1000 capDescChan = 10 +) - contentTypeHeader = "Content-Type" - contentLengthHeader = "Content-Length" - contentEncodingHeader = "Content-Encoding" +// DefaultRegistry is the default registry implicitly used by a number of +// convenience functions. It has a ProcessCollector and a GoCollector +// pre-registered. +var DefaultRegistry = NewRegistry() - acceptEncodingHeader = "Accept-Encoding" - acceptHeader = "Accept" -) +func init() { + MustRegister(NewProcessCollector(os.Getpid(), "")) + MustRegister(NewGoCollector()) +} + +// NewRegistry creates a new vanilla Registry without any Collectors +// pre-registered. +func NewRegistry() Registry { + return &registry{ + collectorsByID: map[uint64]Collector{}, + descIDs: map[uint64]struct{}{}, + dimHashesByName: map[string]uint64{}, + } +} -// Handler returns the HTTP handler for the global Prometheus registry. It is -// already instrumented with InstrumentHandler (using "prometheus" as handler -// name). Usually the handler is used to handle the "/metrics" endpoint. +// NewRegistryWithInjectionHook creates a registry with the provided hook to inject +// MetricFamilies. The hook is a function that is called whenever metrics are +// collected. The MetricFamily protobufs returned by the hook function are +// merged with the metrics collected in the usual way. // -// Please note the issues described in the doc comment of InstrumentHandler. You -// might want to consider using UninstrumentedHandler instead. 
-func Handler() http.Handler { - return InstrumentHandler("prometheus", defRegistry) -} - -// UninstrumentedHandler works in the same way as Handler, but the returned HTTP -// handler is not instrumented. This is useful if no instrumentation is desired -// (for whatever reason) or if the instrumentation has to happen with a -// different handler name (or with a different instrumentation approach -// altogether). See the InstrumentHandler example. -func UninstrumentedHandler() http.Handler { - return defRegistry -} - -// Register registers a new Collector to be included in metrics collection. It -// returns an error if the descriptors provided by the Collector are invalid or -// if they - in combination with descriptors of already registered Collectors - -// do not fulfill the consistency and uniqueness criteria described in the Desc -// documentation. +// This is a way to directly inject MetricFamily protobufs managed and owned by +// the caller. The caller has full responsibility. As no registration of the +// injected metrics has happened, there was no check at registration-time. If +// the injection results in inconsistent metrics, the Collect call will return +// an error. +// +// Sorting concerns: The caller is responsible for sorting the label pairs in +// each metric. However, the order of metrics in each MetricFamily and the order +// of MetricFamilies in the returned slice does not matter as those will be +// sorted by the registry. This sorting is required anyway after merging with +// the metric families collected conventionally. +// +// The function must be callable at any time and concurrently. +func NewRegistryWithInjectionHook(hook func() []*dto.MetricFamily) Registry { + r := NewRegistry().(*registry) + r.metricFamilyInjectionHook = hook + return r +} + +// NewPedanticRegistry returns a registry that checks during collection if the +// Descs provided by registered Collectors are consistent with their collected +// Metrics. 
// -// Do not register the same Collector multiple times concurrently. (Registering -// the same Collector twice would result in an error anyway, but on top of that, -// it is not safe to do so concurrently.) -func Register(m Collector) error { - _, err := defRegistry.Register(m) - return err -} - -// MustRegister works like Register but panics where Register would have -// returned an error. MustRegister is also Variadic, where Register only -// accepts a single Collector to register. -func MustRegister(m ...Collector) { - for i := range m { - if err := Register(m[i]); err != nil { +// Usually, a Registry will be happy as long as the union of all collected +// Metrics is consistent and valid even if some metrics are not consistent with +// any of the Descs provided by their Collector. Well-behaved Collectors will +// only return Metrics consistent with the provided Descs. To test the +// implementation of Collectors, this Registry can be used. +func NewPedanticRegistry() Registry { + r := NewRegistry().(*registry) + r.collectChecksEnabled = true + return r +} + +// Registry is the interface for the metrics registry. +type Registry interface { + // Register registers a new Collector to be included in metrics + // collection. It returns an error if the descriptors provided by the + // Collector are invalid or if they - in combination with descriptors of + // already registered Collectors - do not fulfill the consistency and + // uniqueness criteria described in the documentation of metric.Desc. + // + // If the provided Collector is equal to a Collector already registered + // (which includes the case of re-registering the same Collector), the + // returned error is an instance of AlreadyRegisteredError, which + // contains the previously registered Collector. + // + // It is in general not safe to register the same Collector multiple + // times concurrently. 
+ Register(Collector) error + // Unregister unregisters the Collector that equals the Collector passed + // in as an argument. (Two Collectors are considered equal if their + // Describe method yields the same set of descriptors.) The function + // returns whether a Collector was unregistered. + // + // Note that even after unregistering, it will not be possible to + // register a new Collector that is inconsistent with the unregistered + // Collector, e.g. a Collector collecting metrics with the same name but + // a different help string. The rationale here is that the same registry + // instance must only collect consistent metrics throughout its + // lifetime. + Unregister(Collector) bool + // Collect collects metrics from registered Collectors and returns them + // as lexicographically sorted MetricFamily protobufs. Even if an error + // occurs, Collect attempts to collect as many metrics as + // possible. Hence, if a non-nil error is returned, the returned + // MetricFamily slice could be nil (in case of a fatal error that + // prevented any meaningful metric collection) or contain a number of + // MetricFamily protobufs, some of which might be incomplete, and some + // might be missing altogether. The returned error (which might be a + // multierror.Error) explains the details. In any case, the MetricFamily + // protobufs are consistent and valid for Prometheus to ingest (e.g. no + // duplicate metrics, no invalid identifiers). In scenarios where + // complete collection is critical, the returned MetricFamily protobufs + // should be disregarded if the returned error is non-nil. + Collect() ([]*dto.MetricFamily, error) +} + +// MustRegisterWith registers the provided Collectors with the provided Registry +// and panics upon the first registration that causes an error. 
+func MustRegisterWith(r Registry, cs ...Collector) { + for _, c := range cs { + if err := r.Register(c); err != nil { panic(err) } } } -// RegisterOrGet works like Register but does not return an error if a Collector -// is registered that equals a previously registered Collector. (Two Collectors -// are considered equal if their Describe method yields the same set of -// descriptors.) Instead, the previously registered Collector is returned (which -// is helpful if the new and previously registered Collectors are equal but not -// identical, i.e. not pointers to the same object). +// RegisterWithOrGet registers the provided Collector with the provided Registry +// but does not return an error if a Collector is registered that equals a +// previously registered Collector. (Two Collectors are considered equal if +// their Describe method yields the same set of descriptors.) Instead, the +// previously registered Collector is returned (which is helpful if the new and +// previously registered Collectors are equal but not identical, i.e. not +// pointers to the same object). // -// As for Register, it is still not safe to call RegisterOrGet with the same -// Collector multiple times concurrently. -func RegisterOrGet(m Collector) (Collector, error) { - return defRegistry.RegisterOrGet(m) +// As for RegisterWith, it is still not safe to call RegisterWithOrGet with the +// same Collector multiple times concurrently. +func RegisterWithOrGet(r Registry, c Collector) (Collector, error) { + if err := r.Register(c); err != nil { + if are, ok := err.(AlreadyRegisteredError); ok { + return are.ExistingCollector, nil + } + return nil, err + } + return c, nil } -// MustRegisterOrGet works like Register but panics where RegisterOrGet would -// have returned an error. -func MustRegisterOrGet(m Collector) Collector { - existing, err := RegisterOrGet(m) +// MustRegisterWithOrGet works like RegisterWithOrGet but panics where RegisterWithOrGet +// would have returned an error. 
+func MustRegisterWithOrGet(r Registry, c Collector) Collector { + existing, err := RegisterWithOrGet(r, c) if err != nil { panic(err) } return existing } -// Unregister unregisters the Collector that equals the Collector passed in as -// an argument. (Two Collectors are considered equal if their Describe method -// yields the same set of descriptors.) The function returns whether a Collector -// was unregistered. -func Unregister(c Collector) bool { - return defRegistry.Unregister(c) +// Register is a shortcut for DefaultRegistry.Register(c). +func Register(c Collector) error { + return DefaultRegistry.Register(c) } -// SetMetricFamilyInjectionHook sets a function that is called whenever metrics -// are collected. The hook function must be set before metrics collection begins -// (i.e. call SetMetricFamilyInjectionHook before setting the HTTP handler.) The -// MetricFamily protobufs returned by the hook function are merged with the -// metrics collected in the usual way. -// -// This is a way to directly inject MetricFamily protobufs managed and owned by -// the caller. The caller has full responsibility. As no registration of the -// injected metrics has happened, there is no descriptor to check against, and -// there are no registration-time checks. If collect-time checks are disabled -// (see function EnableCollectChecks), no sanity checks are performed on the -// returned protobufs at all. If collect-checks are enabled, type and uniqueness -// checks are performed, but no further consistency checks (which would require -// knowledge of a metric descriptor). -// -// Sorting concerns: The caller is responsible for sorting the label pairs in -// each metric. However, the order of metrics will be sorted by the registry as -// it is required anyway after merging with the metric families collected -// conventionally. -// -// The function must be callable at any time and concurrently. 
-func SetMetricFamilyInjectionHook(hook func() []*dto.MetricFamily) { - defRegistry.metricFamilyInjectionHook = hook +// MustRegister is a shortcut for MustRegisterWith(DefaultRegistry, cs...). +func MustRegister(cs ...Collector) { + MustRegisterWith(DefaultRegistry, cs...) } -// PanicOnCollectError sets the behavior whether a panic is caused upon an error -// while metrics are collected and served to the HTTP endpoint. By default, an -// internal server error (status code 500) is served with an error message. -func PanicOnCollectError(b bool) { - defRegistry.panicOnCollectError = b +// RegisterOrGet is a shortcut for RegisterWithOrGet(DefaultRegistry, c). +func RegisterOrGet(c Collector) (Collector, error) { + return RegisterWithOrGet(DefaultRegistry, c) } -// EnableCollectChecks enables (or disables) additional consistency checks -// during metrics collection. These additional checks are not enabled by default -// because they inflict a performance penalty and the errors they check for can -// only happen if the used Metric and Collector types have internal programming -// errors. It can be helpful to enable these checks while working with custom -// Collectors or Metrics whose correctness is not well established yet. -func EnableCollectChecks(b bool) { - defRegistry.collectChecksEnabled = b +// MustRegisterOrGet is a shortcut for MustRegisterWithOrGet(DefaultRegistry, c). +func MustRegisterOrGet(c Collector) Collector { + return MustRegisterWithOrGet(DefaultRegistry, c) } -// encoder is a function that writes a dto.MetricFamily to an io.Writer in a -// certain encoding. It returns the number of bytes written and any error -// encountered. Note that pbutil.WriteDelimited and pbutil.MetricFamilyToText -// are encoders. -type encoder func(io.Writer, *dto.MetricFamily) (int, error) +// Unregister is a shortcut for DefaultRegistry.Unregister(c). 
+func Unregister(c Collector) bool { + return DefaultRegistry.Unregister(c) +} + +// AlreadyRegisteredError is returned by the "RegisterOrGet" type of +// registration calls. It contains the existing Collector and the (rejected) new +// Collector that equals the existing one. +type AlreadyRegisteredError struct { + ExistingCollector, NewCollector Collector +} + +func (err AlreadyRegisteredError) Error() string { + return "duplicate metrics collector registration attempted" +} type registry struct { mtx sync.RWMutex collectorsByID map[uint64]Collector // ID is a hash of the descIDs. descIDs map[uint64]struct{} dimHashesByName map[string]uint64 - bufPool chan *bytes.Buffer - metricFamilyPool chan *dto.MetricFamily - metricPool chan *dto.Metric metricFamilyInjectionHook func() []*dto.MetricFamily - - panicOnCollectError, collectChecksEnabled bool + collectChecksEnabled bool } -func (r *registry) Register(c Collector) (Collector, error) { - descChan := make(chan *Desc, capDescChan) +func (r *registry) Register(c Collector) error { + var ( + descChan = make(chan *Desc, capDescChan) + newDescIDs = map[uint64]struct{}{} + newDimHashesByName = map[string]uint64{} + collectorID uint64 // Just a sum of all desc IDs. + duplicateDescErr error + ) go func() { c.Describe(descChan) close(descChan) }() - - newDescIDs := map[uint64]struct{}{} - newDimHashesByName := map[string]uint64{} - var collectorID uint64 // Just a sum of all desc IDs. - var duplicateDescErr error - r.mtx.Lock() defer r.mtx.Unlock() // Coduct various tests... @@ -236,7 +251,7 @@ func (r *registry) Register(c Collector) (Collector, error) { // Is the descriptor valid at all? if desc.err != nil { - return c, fmt.Errorf("descriptor %s is invalid: %s", desc, desc.err) + return fmt.Errorf("descriptor %s is invalid: %s", desc, desc.err) } // Is the descID unique? @@ -257,13 +272,13 @@ func (r *registry) Register(c Collector) (Collector, error) { // First check existing descriptors... 
if dimHash, exists := r.dimHashesByName[desc.fqName]; exists { if dimHash != desc.dimHash { - return nil, fmt.Errorf("a previously registered descriptor with the same fully-qualified name as %s has different label names or a different help string", desc) + return fmt.Errorf("a previously registered descriptor with the same fully-qualified name as %s has different label names or a different help string", desc) } } else { // ...then check the new descriptors already seen. if dimHash, exists := newDimHashesByName[desc.fqName]; exists { if dimHash != desc.dimHash { - return nil, fmt.Errorf("descriptors reported by collector have inconsistent label names or help strings for the same fully-qualified name, offender is %s", desc) + return fmt.Errorf("descriptors reported by collector have inconsistent label names or help strings for the same fully-qualified name, offender is %s", desc) } } else { newDimHashesByName[desc.fqName] = desc.dimHash @@ -272,15 +287,18 @@ func (r *registry) Register(c Collector) (Collector, error) { } // Did anything happen at all? if len(newDescIDs) == 0 { - return nil, errors.New("collector has no descriptors") + return errors.New("collector has no descriptors") } if existing, exists := r.collectorsByID[collectorID]; exists { - return existing, errAlreadyReg + return AlreadyRegisteredError{ + ExistingCollector: existing, + NewCollector: c, + } } // If the collectorID is new, but at least one of the descs existed // before, we are in trouble. if duplicateDescErr != nil { - return nil, duplicateDescErr + return duplicateDescErr } // Only after all tests have passed, actually register. 
@@ -291,26 +309,19 @@ func (r *registry) Register(c Collector) (Collector, error) { for name, dimHash := range newDimHashesByName { r.dimHashesByName[name] = dimHash } - return c, nil -} - -func (r *registry) RegisterOrGet(m Collector) (Collector, error) { - existing, err := r.Register(m) - if err != nil && err != errAlreadyReg { - return nil, err - } - return existing, nil + return nil } -func (r *registry) Unregister(c Collector) bool { - descChan := make(chan *Desc, capDescChan) +func (r *registry) Unregister(c Collector) bool { + var ( + descChan = make(chan *Desc, capDescChan) + descIDs = map[uint64]struct{}{} + collectorID uint64 // Just a sum of the desc IDs. + ) go func() { c.Describe(descChan) close(descChan) }() - - descIDs := map[uint64]struct{}{} - var collectorID uint64 // Just a sum of the desc IDs. for desc := range descChan { if _, exists := descIDs[desc.id]; !exists { collectorID += desc.id @@ -337,6 +348,62 @@ func (r *registry) Unregister(c Collector) bool { return true } +func (r *registry) Collect() ([]*dto.MetricFamily, error) { + return nil, nil // TODO +} + +// metricSorter is a sortable slice of *dto.Metric. +type metricSorter []*dto.Metric + +func (s metricSorter) Len() int { + return len(s) +} + +func (s metricSorter) Swap(i, j int) { + s[i], s[j] = s[j], s[i] +} + +func (s metricSorter) Less(i, j int) bool { + if len(s[i].Label) != len(s[j].Label) { + // This should not happen. The metrics are + // inconsistent. However, we have to deal with the fact, as + // people might use custom collectors or metric family injection + // to create inconsistent metrics. So let's simply compare the + // number of labels in this case. That will still yield + // reproducible sorting. + return len(s[i].Label) < len(s[j].Label) + } + for n, lp := range s[i].Label { + vi := lp.GetValue() + vj := s[j].Label[n].GetValue() + if vi != vj { + return vi < vj + } + } + + // We should never arrive here. 
Multiple metrics with the same + // label set in the same scrape will lead to undefined ingestion + // behavior. However, as above, we have to provide stable sorting + // here, even for inconsistent metrics. So sort equal metrics + // by their timestamp, with missing timestamps (implying "now") + // coming last. + if s[i].TimestampMs == nil { + return false + } + if s[j].TimestampMs == nil { + return true + } + return s[i].GetTimestampMs() < s[j].GetTimestampMs() +} + +// TODO port from here on. + +// encoder is a function that writes a dto.MetricFamily to an io.Writer in a +// certain encoding. It returns the number of bytes written and any error +// encountered. Note that pbutil.WriteDelimited and pbutil.MetricFamilyToText +// are encoders. +type encoder func(io.Writer, *dto.MetricFamily) (int, error) + func (r *registry) Push(job, instance, pushURL, method string) error { if !strings.Contains(pushURL, "://") { pushURL = "http://" + pushURL @@ -665,21 +732,6 @@ func (r *registry) giveMetric(m *dto.Metric) { } func newRegistry() *registry { - return &registry{ - collectorsByID: map[uint64]Collector{}, - descIDs: map[uint64]struct{}{}, - dimHashesByName: map[string]uint64{}, - bufPool: make(chan *bytes.Buffer, numBufs), - metricFamilyPool: make(chan *dto.MetricFamily, numMetricFamilies), - metricPool: make(chan *dto.Metric, numMetrics), - } -} - -func newDefaultRegistry() *registry { - r := newRegistry() - r.Register(NewProcessCollector(os.Getpid(), "")) - r.Register(NewGoCollector()) - return r } // decorateWriter wraps a writer to handle gzip compression if requested. 
It @@ -696,46 +748,3 @@ func decorateWriter(request *http.Request, writer io.Writer) (io.Writer, string) } return writer, "" } - -type metricSorter []*dto.Metric - -func (s metricSorter) Len() int { - return len(s) -} - -func (s metricSorter) Swap(i, j int) { - s[i], s[j] = s[j], s[i] -} - -func (s metricSorter) Less(i, j int) bool { - if len(s[i].Label) != len(s[j].Label) { - // This should not happen. The metrics are - // inconsistent. However, we have to deal with the fact, as - // people might use custom collectors or metric family injection - // to create inconsistent metrics. So let's simply compare the - // number of labels in this case. That will still yield - // reproducible sorting. - return len(s[i].Label) < len(s[j].Label) - } - for n, lp := range s[i].Label { - vi := lp.GetValue() - vj := s[j].Label[n].GetValue() - if vi != vj { - return vi < vj - } - } - - // We should never arrive here. Multiple metrics with the same - // label set in the same scrape will lead to undefined ingestion - // behavior. However, as above, we have to provide stable sorting - // here, even for inconsistent metrics. So sort equal metrics - // by their timestamp, with missing timestamps (implying "now") - // coming last. - if s[i].TimestampMs == nil { - return false - } - if s[j].TimestampMs == nil { - return true - } - return s[i].GetTimestampMs() < s[j].GetTimestampMs() -}