Thanos, Prometheus and Golang version used
Replicated on v0.3.2, v0.4.0, v0.6.0, and v0.6.1
What happened
Store gets an unexpected EOF when trying to read a chunk because the Azure library issues an io.ReadFull call with an end point that is past the end of the file.
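For reference, this is standard-library behaviour: io.ReadFull returns io.ErrUnexpectedEOF whenever it reads some, but not all, of the bytes it was asked for. A minimal, self-contained illustration (the 10-byte "blob" and the 16-byte request are made-up numbers):

package main

import (
	"bytes"
	"fmt"
	"io"
)

func main() {
	// Pretend the data actually available for the requested range is only
	// 10 bytes, but the caller asked for 16 (an end point past the end of
	// the file).
	blob := bytes.NewReader([]byte("0123456789"))
	buf := make([]byte, 16)

	// io.ReadFull fails with io.ErrUnexpectedEOF when it gets some, but
	// not all, of the requested bytes.
	_, err := io.ReadFull(blob, buf)
	fmt.Println(err) // prints "unexpected EOF"
}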
This error does not get passed up the stack by oklog/run when the group is run in func (s *BucketStore) Series(req *storepb.SeriesRequest, srv storepb.Store_SeriesServer) (err error), which made this more fun to debug 😄
What you expected to happen
Either Thanos should not request an end point that is past the end of the file, or the Azure library should handle that case gracefully.
I don't think Thanos knows ahead of time how big the chunk is, so I'm leaning towards this being the Azure library's fault. This code is what leads me to think that:
parts := r.block.partitioner.Partition(len(offsets), func(i int) (start, end uint64) {
	return uint64(offsets[i]), uint64(offsets[i]) + maxChunkSize
})
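To spell out why the end point can overshoot: for the last chunk in a segment file, offsets[i] + maxChunkSize points past the end of the object, so the resulting range request asks for more bytes than exist. A small sketch with invented sizes (only the offset + maxChunkSize arithmetic comes from the snippet above):

package main

import "fmt"

func main() {
	// Invented sizes, just to illustrate the last-partition overshoot.
	const maxChunkSize = 16000      // fixed upper bound added to every chunk offset
	objectLength := uint64(1200000) // actual size of the chunks segment file
	lastOffset := uint64(1195000)   // start offset of the last chunk in that file

	end := lastOffset + maxChunkSize
	// The range ends 11000 bytes past the end of the object, so a strict
	// io.ReadFull of (end - start) bytes fails with "unexpected EOF".
	fmt.Println(end > objectLength, end-objectLength) // true 11000
}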
How to reproduce it (as minimally and precisely as possible):
This may be a complete coincidence, but this metric is the exact last one I have in Prometheus when sorted alphabetically. Try using Azure Blob and querying your last metric?
Anything else we need to know
I have already fixed this issue in the Azure library. I'm going to test it more, submit a PR to their repo, and then I'll submit a PR here to bump the version in go.mod.
Additionally, I think it'd be worth looking into why using github.com/oklog/run doesn't report errors that the group encounters. Maybe we should add error logging in the functions being run?
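A minimal sketch of what that error logging could look like, assuming a go-kit logger is in scope (the actor body below is a stand-in, not the real Series code):

package main

import (
	"errors"
	"os"

	"github.com/go-kit/kit/log"
	"github.com/go-kit/kit/log/level"
	"github.com/oklog/run"
)

func main() {
	logger := log.NewLogfmtLogger(os.Stderr)

	var g run.Group
	g.Add(func() error {
		// Stand-in for the real work done inside the actor.
		err := errors.New("unexpected EOF")
		if err != nil {
			// Log at the point of failure so the error is visible even if
			// the group's return value is dropped further up the stack.
			level.Error(logger).Log("msg", "actor exited with error", "err", err)
		}
		return err
	}, func(error) {})

	if err := g.Run(); err != nil {
		level.Error(logger).Log("msg", "group exited", "err", err)
	}
}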
Turns out that while, yes, it would be nice if the Azure library automatically detected that we were trying to download bytes past the actual length of the file, I could just update azure.go to handle it preemptively :)
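Roughly, the preemptive handling amounts to clamping the requested length to the blob's real size before issuing the download. A hedged sketch of that idea, not the actual azure.go change — getBlobSize and download are assumed helpers standing in for the Azure SDK calls:

// getRange clamps the requested byte range to the blob's actual size before
// downloading, so the last chunk of a segment file never asks for bytes past
// the end of the object.
func (b *Bucket) getRange(ctx context.Context, name string, offset, length int64) (io.ReadCloser, error) {
	size, err := b.getBlobSize(ctx, name) // assumed helper: blob properties lookup
	if err != nil {
		return nil, err
	}
	if offset >= size {
		return nil, fmt.Errorf("offset %d is past the end of blob %s (size %d)", offset, name, size)
	}
	if length > 0 && offset+length > size {
		// Trim the range instead of letting the download fail later with an
		// unexpected EOF inside io.ReadFull.
		length = size - offset
	}
	return b.download(ctx, name, offset, length) // assumed helper wrapping the SDK download call
}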