Enable sharding for active_series requests #6784

Conversation
Interesting, I guess that makes sense in hindsight, since this API is used for a huge range of response sizes. Even in huge tenants, many requests will return fewer than 10 series, for example, and splitting those into 12 shards is mostly overhead.

This looks great! Thanks for working on it. My main feedback is about the potential for over-sharding. Say for example I send a request with

WDYT?
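For illustration only, a minimal Go sketch of one way to address the over-sharding concern: cap the shard count based on an estimated result size so small requests run unsharded. `shardCountFor`, `estimatedSeries`, `maxShards`, and the threshold are hypothetical, not Mimir's actual parameters:

```go
// shardCountFor is a hypothetical heuristic: scale the number of shards
// with the estimated result size so that small requests run unsharded.
func shardCountFor(estimatedSeries, maxShards int) int {
	// Illustrative threshold: aim for roughly this many series per shard.
	const targetSeriesPerShard = 1000

	shards := estimatedSeries / targetSeriesPerShard
	if shards < 1 {
		return 1 // sharding a tiny request is mostly coordination overhead
	}
	if shards > maxShards {
		return maxShards
	}
	return shards
}
```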
```diff
-	strings.HasSuffix(path, cardinalityLabelValuesPathSuffix) ||
-		strings.HasSuffix(path, cardinalityActiveSeriesPathSuffix)
+	strings.HasSuffix(path, cardinalityLabelValuesPathSuffix)
 }
```
One of the side effects here is that we no longer apply the `newCardinalityQueryCacheRoundTripper` middleware to the active series endpoint. Is that intended?
Yes, that's intended; I've tried to explain the reasoning in the PR description:

> This PR also introduces a dedicated roundtripper for active series requests to bypass the generic cache. Since this endpoint is supposed to return "fresh" data, setups with a large cache TTL could yield outdated results if the cache is enabled and not manually bypassed when querying.
LGTM, great work.

As discussed on Slack, my main concern is that there's nothing actually checking that we're not loading all shards into memory before responding. Can we add a test for that? You could orchestrate the `next` middlewares so that we check we're getting a response before the second shard's `next` starts writing its JSON.
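A minimal sketch of such a test, assuming an `http.RoundTripper`-based middleware chain. `RoundTripFunc`, `isSecondShard`, `shardResponse`, `shardingMiddleware`, and `newActiveSeriesRequest` are hypothetical stand-ins, not Mimir's actual test helpers:

```go
import (
	"net/http"
	"testing"

	"github.com/stretchr/testify/require"
)

// RoundTripFunc adapts a function to http.RoundTripper for test doubles.
type RoundTripFunc func(*http.Request) (*http.Response, error)

func (f RoundTripFunc) RoundTrip(r *http.Request) (*http.Response, error) { return f(r) }

func TestActiveSeriesResponseIsStreamed(t *testing.T) {
	responseStarted := make(chan struct{})

	// The downstream "next" handler: the second shard blocks until the
	// merged response has started streaming back to the client.
	next := RoundTripFunc(func(r *http.Request) (*http.Response, error) {
		if isSecondShard(r) { // hypothetical helper inspecting the shard request
			<-responseStarted
		}
		return shardResponse(r), nil // hypothetical canned shard response
	})

	resp, err := shardingMiddleware(next).RoundTrip(newActiveSeriesRequest())
	require.NoError(t, err)

	// If the frontend buffered all shards before responding, this read
	// would never return, because shard 2 is still blocked above.
	buf := make([]byte, 1)
	_, err = resp.Body.Read(buf)
	require.NoError(t, err)
	close(responseStarted) // now let shard 2 finish
}
```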
Thanks for adding the test.
What this PR does

To build a response for an /active_series request, queriers need to hold the full series set in memory for deduplication. For selectors that match a large set of series this can consume a lot of memory, as allocations are proportional to the size of the set.

This PR allows sharding these requests in the frontend so that each querier only needs to bring part of the series set into memory for deduplication. The frontend then interleaves the partial responses into a single set that is returned to the client (see the sketch below). It also introduces a response size limit for active series responses in queriers to prevent unbounded allocations and OOMs.
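As a rough illustration of the merging step, a minimal Go sketch that interleaves shard results as they arrive instead of buffering whole shard responses. The channel-of-strings shape is an assumption for readability, not Mimir's actual types:

```go
// interleave forwards one series at a time from each live shard in
// round-robin order, so no shard's full result set is held in memory.
// The types here are hypothetical; in the frontend each shard would be
// a streaming partial response rather than a channel of strings.
func interleave(shards []<-chan string, out chan<- string) {
	defer close(out)
	for len(shards) > 0 {
		// Standard in-place filter: keep only shards that still have data.
		live := shards[:0]
		for _, ch := range shards {
			series, ok := <-ch
			if !ok {
				continue // shard exhausted, drop it from the rotation
			}
			out <- series
			live = append(live, ch)
		}
		shards = live
	}
}
```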
This PR also introduces a dedicated roundtripper for active series requests to bypass the generic cache. Since this endpoint is supposed to return "fresh" data, setups with a large cache TTL could yield outdated results if the cache is enabled and not manually bypassed when querying.
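A minimal sketch of that routing, reusing the `RoundTripFunc` adapter from the test sketch above. The `wrap` function is hypothetical; `newCardinalityQueryCacheRoundTripper` and `cardinalityActiveSeriesPathSuffix` are the identifiers discussed in the review, but their exact signatures are assumptions:

```go
// wrap routes active series requests straight to next, skipping the
// generic cardinality results cache that other cardinality endpoints use.
func wrap(next http.RoundTripper) http.RoundTripper {
	// Existing cache middleware; its constructor signature is assumed here.
	cached := newCardinalityQueryCacheRoundTripper(next)
	return RoundTripFunc(func(r *http.Request) (*http.Response, error) {
		if strings.HasSuffix(r.URL.Path, cardinalityActiveSeriesPathSuffix) {
			return next.RoundTrip(r) // bypass the cache: results must be fresh
		}
		return cached.RoundTrip(r)
	})
}
```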
Which issue(s) this PR fixes or relates to
n/a
Checklist

- `CHANGELOG.md` updated - the order of entries should be `[CHANGE]`, `[FEATURE]`, `[ENHANCEMENT]`, `[BUGFIX]`.
- `about-versioning.md` updated with experimental features.