store: Azure hits EOF when reading last metric in a chunk #1466

Closed
wbh1 opened this issue Aug 27, 2019 · 3 comments · Fixed by #1469

@wbh1
Contributor

wbh1 commented Aug 27, 2019

Thanos, Prometheus and Golang version used
Replicated on v0.3.2, v0.4.0, v0.6.0, and v0.6.1

What happened
Store gets an unexpected EOF when reading a chunk because the Azure library calls io.ReadFull with an end offset that lies beyond the end of the file.

oklog/run does not pass this error up the stack when the group is run in
func (s *BucketStore) Series(req *storepb.SeriesRequest, srv storepb.Store_SeriesServer) (err error), which made this more fun to debug 😄

What you expected to happen
Either Thanos should not request an end offset beyond the end of the file, or the Azure library should handle it gracefully.

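I don't think Thanos knows ahead of time how big the chunk is, so I'm leaning towards this being Azure's fault. This leads me to think that the over-long range comes from here, where the end of each requested range is the chunk offset plus maxChunkSize: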

		parts := r.block.partitioner.Partition(len(offsets), func(i int) (start, end uint64) {
			return uint64(offsets[i]), uint64(offsets[i]) + maxChunkSize
		})

How to reproduce it (as minimally and precisely as possible):
This may be a complete coincidence, but this metric is the exact last one I have in Prometheus when sorted alphabetically. Try using Azure Blob and querying your last metric?

Anything else we need to know
I have already fixed this issue in the Azure library. I'm going to test it more, submit a PR to their repo, and then submit a PR here to bump the version in go.mod.

Additionally, I think it'd be worth looking into why using github.com/oklog/run doesn't report errors that the group encounters. Maybe we should add Error logging in the functions being run?

@jojohappy
Member

/cc @vglafirov

@wbh1
Contributor Author

wbh1 commented Aug 28, 2019

It turns out that, while it would be nice if the Azure library detected that we were trying to download bytes past the actual length of the file, I can just update azure.go to handle it preemptively :)
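Handling it preemptively could mean clamping the requested length to the object's size before issuing the ranged read. The following is a sketch under that assumption and is not the actual change to azure.go: `clampLength` is a hypothetical helper, and `size` is assumed to be the blob's length as obtained from its properties.

```go
package main

import "fmt"

// clampLength is a hypothetical helper: given the blob size, it shortens a
// requested (offset, length) range so it never extends past the last byte.
func clampLength(offset, length, size int64) int64 {
	if offset < 0 || length < 0 || offset >= size {
		return 0 // nothing readable at this offset
	}
	if length > size-offset {
		return size - offset // trim the range to what is left in the blob
	}
	return length
}

func main() {
	const size = 10 // pretend the blob is 10 bytes long

	fmt.Println(clampLength(0, 5, size))   // 5: range already fits
	fmt.Println(clampLength(6, 16, size))  // 4: trimmed to the remaining bytes
	fmt.Println(clampLength(10, 16, size)) // 0: offset is at the end of the blob
}
```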

@devalexx

devalexx commented Sep 19, 2019

+1, the Azure integration is currently completely broken. Please approve at least this temporary solution (by @wbh1), as it works for me too.
