query: Panic when 2 thanos-query are connected? #4743
Comments
I tried this setup but cannot reproduce this issue. |
ok thanks I will dig further to try to reproduce with minimal steps. |
It definitely seems like we're trying to access something that's not there at https://github.com/thanos-io/thanos/blob/release-0.23/pkg/api/query/v1.go#L717. My guess is that @hitanshu-mehta any idea if this makes sense? |
Hmm, so I found a misconfiguration: Thanos was trying to target an IP:port where no thanos-query was running. |
@ahurtaud the scenario you mention would actually make sense to me: if you specify a host:port address where nothing is running (or something unexpected is), I would guess the status of the endpoint would only include an error, hence the panic when trying to obtain the component type. Whether it is an accidental misconfiguration or not, we should not be panicking; this is a valid bug. |
So I was able to reproduce it fairly reliably: I adjusted the E2E test for query and threw in a couple of 'made-up' store addresses that always returned errors. If I tried to make a call to the |
{"address":"10.0.0.248:10901","caller":"endpointset.go:525","component":"endpointset","err":"getting metadata: fallback fetching info from 10.0.0.248:10901: rpc error: code = DeadlineExceeded desc = context deadline exceeded","level":"warn","msg":"update of node failed","ts":"2021-10-11T12:44:35.206298133Z"}
{"address":"10.0.0.212:10901","caller":"endpointset.go:525","component":"endpointset","err":"getting metadata: fallback fetching info from 10.0.0.212:10901: rpc error: code = DeadlineExceeded desc = context deadline exceeded","level":"warn","msg":"update of node failed","ts":"2021-10-11T12:44:35.206377728Z"}
{"address":"10.0.0.248:10901","caller":"endpointset.go:525","component":"endpointset","err":"getting metadata: fallback fetching info from 10.0.0.248:10901: rpc error: code = DeadlineExceeded desc = context deadline exceeded","level":"warn","msg":"update of node failed","ts":"2021-10-11T12:44:40.206936877Z"}
{"address":"10.0.0.212:10901","caller":"endpointset.go:525","component":"endpointset","err":"getting metadata: fallback fetching info from 10.0.0.212:10901: rpc error: code = DeadlineExceeded desc = context deadline exceeded","level":"warn","msg":"update of node failed","ts":"2021-10-11T12:44:40.207256941Z"}
{"address":"10.0.0.212:10901","caller":"endpointset.go:525","component":"endpointset","err":"getting metadata: fallback fetching info from 10.0.0.212:10901: rpc error: code = DeadlineExceeded desc = context deadline exceeded","level":"warn","msg":"update of node failed","ts":"2021-10-11T12:44:45.209680178Z"}
{"address":"10.0.0.248:10901","caller":"endpointset.go:525","component":"endpointset","err":"getting metadata: fallback fetching info from 10.0.0.248:10901: rpc error: code = DeadlineExceeded desc = context deadline exceeded","level":"warn","msg":"update of node failed","ts":"2021-10-11T12:44:45.209708408Z"}
{"address":"10.0.0.248:10901","caller":"endpointset.go:525","component":"endpointset","err":"getting metadata: fallback fetching info from 10.0.0.248:10901: rpc error: code = DeadlineExceeded desc = context deadline exceeded","level":"warn","msg":"update of node failed","ts":"2021-10-11T12:44:50.211450445Z"}
{"address":"10.0.0.212:10901","caller":"endpointset.go:525","component":"endpointset","err":"getting metadata: fallback fetching info from 10.0.0.212:10901: rpc error: code = DeadlineExceeded desc = context deadline exceeded","level":"warn","msg":"update of node failed","ts":"2021-10-11T12:44:50.211469304Z"} |
What happened:
Thanos v0.23.1:
We have one thanos-query connected to a list of other thanos-query instances, passed as --store endpoints over secured gRPC.
What you expected to happen:
As before v0.23.1 (in 0.22.0), the Thanos stores page should list the available thanos-query instances.
However, querying itself works fine: it can query data from the registered stores (query).
How to reproduce it (as minimally and precisely as possible):
I think:
Have one thanos-query register another thanos-query with the
- --store=x.x.x.x:10901
flag, then open the Thanos stores page to list the registered store components.
Full logs to relevant components:
Please open the following Panic logs:
Anything else we need to know:
I suspect this is related to the newly-removed info gRPC endpoint for metadata, a change I only briefly followed on Slack :/