Add support for Prometheus remote read and write API. #8784
Conversation
Adds a new package prometheus for converting from remote reads and writes to Influx queries and points. Adds two new endpoints to the httpd handler to support prometheus remote read at /api/v1/prom/read and remote write at /api/v1/prom/write.
Just curious, why add this here instead of as a telegraf input?
@phemmer It's just another input method, kind of like how we support OpenTSDB, Carbon, and CollectD. It's writing data directly to the DB, whereas Telegraf is more of a collector, although admittedly Telegraf has gained more than that over time. Which listener inputs were you thinking of in Telegraf?
Looks pretty good to me @pauldix 👍 . I just left some suggestions, but nothing significant to the workings of the PR.
@@ -0,0 +1,170 @@
package prometheus
Maybe this should be under services/prometheus to match the location of other packages such as opentsdb, graphite, etc.?
Maybe? I didn't put it under there because there wasn't an associated service running. I think of things under services as things that have network listeners.
That's a good point.
prometheus/converters.go
	FieldName = "f64"
)

var ErrNaNDropped = errors.New("")
I think a non-empty error string would be more helpful 😄
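For example, something along these lines (the exact wording chosen in a2d7024 may differ; this is just a hypothetical message):

// hypothetical wording, not necessarily what a2d7024 uses
var ErrNaNDropped = errors.New("dropped NaN values from prometheus remote write")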
Addressed in a2d7024
prometheus/converters.go
// WriteRequestToPoints converts a Prometheus remote write request of time series and their
// samples into Points that can be written into Influx
func WriteRequestToPoints(req *remote.WriteRequest) ([]models.Point, error) {
	var points []models.Point
Maybe worth pre-allocating points? Since it's simple to get the total number of samples you could do:
var maxPoints int
for _, ts := range req.Timeseries {
	maxPoints += len(ts.Samples)
}
points := make([]models.Point, 0, maxPoints)
prometheus/converters.go

const (
	// MeasurementName is where all prometheus time series go to
	MeasurementName = "_"
Will this be needed outside of the package? Could it be unexported?
Probably :)
prometheus/converters.go
	Database:        db,
	RetentionPolicy: rp,
}},
Dimensions: []*influxql.Dimension{{Expr: &influxql.Wildcard{}}},
Isn't a Wildcard expression only used when doing a SELECT *?
Also used when doing a GROUP BY *, which this needs to do.
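As an illustration only (a sketch, not the PR's actual converter code: the database and retention policy values are made up, the field and measurement names come from the constants shown earlier, and the influxql package is assumed to be imported):

stmt := &influxql.SelectStatement{
	Fields: []*influxql.Field{{Expr: &influxql.VarRef{Val: "f64"}}},
	Sources: []influxql.Source{&influxql.Measurement{
		Database:        "prometheus", // illustrative values
		RetentionPolicy: "autogen",
		Name:            "_",
	}},
	// a wildcard dimension is what renders as GROUP BY * in the query text
	Dimensions: []*influxql.Dimension{{Expr: &influxql.Wildcard{}}},
}
// stmt.String() renders roughly as: SELECT f64 FROM prometheus.autogen._ GROUP BY *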
services/httpd/handler.go
}

for _, v := range s.Values {
	timestamp := v[0].(time.Time).UnixNano() / int64(time.Millisecond) / int64(time.Nanosecond)
s.Values is a type from the prometheus package right? If so, I would be inclined to avoid the unsafe type assertion on elements in Values, if only in case a third party bug causes v[0] to not be a time.Time (which would cause a panic). I'd use ts, ok := v[0].(time.Time) and check ok, I think.
One other thing: time.Nanosecond == 1, so you don't need to divide by it.
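A minimal sketch of the suggested guard, inside the loop shown above:

t, ok := v[0].(time.Time) // "t" rather than "ts" to avoid shadowing the time series variable used later
if !ok {
	// unexpected type from the query results; skip rather than panic
	continue
}
timestamp := t.UnixNano() / int64(time.Millisecond)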
Good call on the check. On the divide by time.Nanosecond, I did that even though it's 1 because apparently those underlying values aren't guaranteed, based on the pedantic conversation in the answers of this post: https://stackoverflow.com/questions/24122821/go-golang-time-now-unixnano-convert-to-milliseconds
Addressed in f0ffe20
services/httpd/handler.go
timestamp := v[0].(time.Time).UnixNano() / int64(time.Millisecond) / int64(time.Nanosecond)
ts.Samples = append(ts.Samples, &remote.Sample{
	TimestampMs: timestamp,
	Value:       v[1].(float64),
Similar concern here with v[1].(float64)
Addressed in f0ffe20
services/httpd/handler.go

data, err := proto.Marshal(resp)
if err != nil {
	http.Error(w, err.Error(), http.StatusInternalServerError)
h.httpError
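i.e., presumably swapping the stdlib call for the handler's own error writer, something like the line below (the signature is assumed to mirror http.Error):

h.httpError(w, err.Error(), http.StatusInternalServerError) // assumed to render the error in the handler's response format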
services/httpd/handler.go

compressed = snappy.Encode(nil, data)
if _, err := w.Write(compressed); err != nil {
	http.Error(w, err.Error(), http.StatusInternalServerError)
h.httpError
services/httpd/handler_test.go
h.PointsWriter.WritePointsFn = func(db, rp string, _ models.ConsistencyLevel, _ meta.User, points []models.Point) error {
	called = true
	point := points[0]
	if point.UnixNano() != 1*int64(time.Millisecond) {
1* is redundant.
ah, was because that was the timestamp that was passed in, but will remove
// servePromRead will convert a Prometheus remote read request into an InfluxQL query and
// return data in Prometheus remote read protobuf format.
func (h *Handler) servePromRead(w http.ResponseWriter, r *http.Request, user meta.User) {
	compressed, err := ioutil.ReadAll(r.Body)
Is it appropriate to use h.httpError here? That will output the error in a format for InfluxDB. While at the moment we use JSON, it could be any format like CSV. If we are attempting to send an error back that matches what Prometheus is expecting, it probably isn't correct to use h.httpError.
}

// Execute query.
results := h.QueryExecutor.ExecuteQuery(q, opts, closing)
This entire section here should probably be refactored because it is about 95% in common with serveQuery, but that doesn't have to be done now. I think #8725 makes it a lot easier. So no action on this, but I just wanted to get it on the record.
Yeah, the same is true of the servePromWrite method, but I wanted to get this in without modifying any of the existing code paths to make it a very low-risk merge.
prometheus/converters.go
}
promQuery := req.Queries[0]

q := &influxql.Query{}
It would probably be easier to just create this at the bottom when you need it with a literal. The literal will likely allocate less memory to boot since it doesn't use append.
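A rough sketch of that suggestion, with stmt standing in for the SELECT statement built earlier in the function (a hypothetical name, not the PR's actual variable):

// build the query once the statement is ready, instead of starting from an
// empty influxql.Query and appending to it
q := &influxql.Query{
	Statements: []influxql.Statement{stmt},
}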
@pauldix, really just a nit, but given our only dependency is the .proto definition file, we could pull only the remote.proto file in. This would allow us to generate our own .pb.go using gogo, which is more efficient and would avoid pulling in the entire prometheus repository.
The only thing used from Prometheus was the storage/remote files that are generated from the remote.proto file. Copied that file into the prometheus package and removed the dependency.
@stuartcarnie @e-dard @jsternberg ok, removed the prometheus dependency and just copied the remote.proto file in.
LGTM 👍🏻
prometheus/remote.proto

syntax = "proto3";

package prometheus; // change package from remote to prometheus
Can we put this file in an internal package named remote? I do not want us doing any manual modifications to this file if at all possible. Something like this should work:
prometheus/
    internal/
        remote/
            remote.proto
And then modify the go generate command to output the remote.pb.go file to this internal directory. Since it's in the remote folder, the package name will still be remote and it will be accessible to the prometheus package, but it won't be accessible to anyone else. That will make updating that file easier when we need to because there are no modifications.
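For instance, the generate directive could point at the relocated proto file, something along these lines (the exact flags and paths are an assumption, not what the PR necessarily ends up with):

// hypothetical directive: protoc mirrors the input path (relative to -I.) under
// the output directory, so remote.pb.go would land in internal/remote/
//go:generate protoc -I$GOPATH/src -I. --gogofaster_out=. internal/remote/remote.proto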
Why the internal in front? Why not just put it in remote?
It means it's not an interface we expose. While not necessary, I think it would be a good habit for us to start getting into.
That feels odd to me. Go already has the public vs. private distinction within a package. If we go down this route we'll end up having internal in every package in the project, no?
prometheus/converters.go
"github.com/influxdata/influxdb/models" | ||
) | ||
|
||
//go:generate protoc -I$GOPATH/src -I. --gogofaster_out=. remote.proto |
In influxql, we just use --gogo_out. Are those different generators, and is this adding a new dependency? We likely should avoid using multiple protoc generators if we can. Also, is the -I needed? I don't see any included dependencies in the proto file.
gogofaster is a superset of the functionality of gogo, with additional optimizations in the generated code and fewer pointer fields. This generally makes the structs easier and more efficient to work with in Go.
The protobuf wire format is unchanged
Should we be using it instead of gogo everywhere and avoid having multiple generators? I'm not opposed to switching generators, I would just like it if we used one. It also doesn't have to be this PR that switches everything to a new generator.
Agree that we should be consistent and should move to the new generators. In many cases, we will need to do a significant amount of refactoring, but the effort will be worthwhile for performance and ease of use.
Just to make sure the other part of the comment isn't lost: is the -I needed?
ok, so use the vanilla generator for now and we can evaluate gogofaster later?
Go with gogofaster. We are using it for the new storage APIs and should plan on migrating others over. It does not introduce additional dependencies. The -I includes mean that if future versions of the .proto file import other proto files, you won't have to update the generate line, so it is safe to leave it in.
ok, should be all moved around and good now
I've tested just against a local Prometheus server. Will have to prove it out over a large installation so we'd probably want to release it as a beta feature.
I wanted to get this up for review for the approach. Could add unit tests to the prometheus package, but the handler test should hit the main points. Was thinking that I may just hold this and base the query stuff off what @stuartcarnie is doing for the v2 read API. It'll have a similar structure as Prometheus to support the v2 data model.
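For anyone wanting to try this against a local server, a minimal prometheus.yml pointing at the new endpoints might look like the sketch below. The endpoint paths come from this PR; the port and the db query parameter are assumptions about a default local InfluxDB setup, not something specified here.

# illustrative only: port 8086 and the db parameter are assumptions
remote_write:
  - url: "http://localhost:8086/api/v1/prom/write?db=prometheus"
remote_read:
  - url: "http://localhost:8086/api/v1/prom/read?db=prometheus"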