
Enable sharding for active_series requests #6784

Merged: 27 commits merged into main from felix/shard-active-series on Jan 4, 2024

Conversation

@flxbk (Contributor) commented Nov 30, 2023

What this PR does

To build a response for an /active_series request, queriers need to hold the full series set in memory for deduplication. For selectors that return a large set of series this can consume a lot of memory, as allocations are proportional to the size of the set.

This PR allows sharding these requests in the frontend so that each querier only needs to bring part of the series set into memory for deduplication. The frontend then interleaves the partial responses into a single set that is returned to the client. It also introduces a response size limit for active series responses in queriers to prevent unbounded allocations and OOMs.
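
For illustration, a much-simplified sketch of the interleaving step. This is not Mimir's actual code: the helper name is hypothetical, and it assumes newline-delimited JSON items where the real response is a JSON array. The point is that, because shards partition the series set disjointly, items can be copied through one at a time and the frontend never has to hold the full set in memory:

```go
import (
	"encoding/json"
	"io"
)

// mergeShardResponses streams each shard's partial response into a single
// output. Sketch only: shards partition the series set, so no cross-shard
// deduplication is needed and items can be forwarded as they are decoded.
func mergeShardResponses(w io.Writer, shards []io.Reader) error {
	enc := json.NewEncoder(w)
	for _, shard := range shards {
		dec := json.NewDecoder(shard)
		for {
			var item map[string]string // one series' label set, simplified
			if err := dec.Decode(&item); err == io.EOF {
				break
			} else if err != nil {
				return err
			}
			if err := enc.Encode(item); err != nil {
				return err
			}
		}
	}
	return nil
}
```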

This PR also introduces a dedicated roundtripper for active series requests that bypasses the generic cache. Since this endpoint is supposed to return "fresh" data, setups with a large cache TTL could serve outdated results if the cache were enabled and not explicitly bypassed.

Which issue(s) this PR fixes or relates to

n/a

Checklist

  • Tests updated.
  • Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
  • about-versioning.md updated with experimental features.

@flxbk flxbk force-pushed the felix/shard-active-series branch 17 times, most recently from 4e94fba to 0c08d3c Compare December 4, 2023 11:53
@flxbk flxbk marked this pull request as ready for review December 4, 2023 15:59
@flxbk flxbk requested review from a team as code owners December 4, 2023 15:59
@Logiraptor (Contributor)

> In an earlier version of this PR I used a tenant's query_sharding_total_shards to set the shard count. This proved very inefficient, as queriers would be flooded with requests that spend most of their life waiting in the queue.

Interesting, I guess that makes sense in hindsight, since this API is used for a huge range of response sizes. Even in huge tenants, many requests will return fewer than 10 series, for example, and sharding those 12 times is mostly overhead.


This looks great! Thanks for working on it. My main feedback is about the potential for over-sharding. Say, for example, I send a request with Sharding-Control: 1000000; I believe the code as written will open 1M parallel requests, which could be an issue. I can think of two solutions to consider:

  1. We could bound the number of shards to query-frontend.query-sharding-max-sharded-queries (which defaults to 128). Arguably this limit is meant for promql, so I'm not sure in practice if that default also makes sense for active series queries. 128 * 5MB response limit is 640MB, which sounds like a lot, but I'm not sure how that translates to number of series. 🤔
  2. We could use concurrency.ForEachJob from dskit to process the shard requests with some bounded level of parallelism.

WDYT?
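
For reference, a minimal sketch of option 2 using dskit's concurrency helper (the fetchShardFn parameter is a placeholder for whatever issues the per-shard request; this is not the code in this PR):

```go
import (
	"context"

	"github.com/grafana/dskit/concurrency"
)

// fetchAllShards runs one job per shard with at most maxParallelism in
// flight at a time; the first error cancels the remaining jobs.
func fetchAllShards(ctx context.Context, numShards, maxParallelism int,
	fetchShardFn func(ctx context.Context, shard int) error) error {
	return concurrency.ForEachJob(ctx, numShards, maxParallelism, fetchShardFn)
}
```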

Comment on lines -414 to 424
```diff
-	strings.HasSuffix(path, cardinalityLabelValuesPathSuffix) ||
-	strings.HasSuffix(path, cardinalityActiveSeriesPathSuffix)
+	strings.HasSuffix(path, cardinalityLabelValuesPathSuffix)
 }
```
Contributor:

One of the side effects here is that we no longer apply the newCardinalityQueryCacheRoundTripper middleware to the active series endpoint; is that intended?

Contributor (Author):

Yes, that's intended; I've tried to explain the reasoning in the PR description:

> This PR also introduces a dedicated roundtripper for active series requests that bypasses the generic cache. Since this endpoint is supposed to return "fresh" data, setups with a large cache TTL could serve outdated results if the cache were enabled and not explicitly bypassed.
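
To illustrate the bypass idea (names are hypothetical and the actual wiring in this PR differs), requests to the active series path can be routed to a chain that simply has no cache middleware in it:

```go
import (
	"net/http"
	"strings"
)

type roundTripperFunc func(*http.Request) (*http.Response, error)

func (f roundTripperFunc) RoundTrip(r *http.Request) (*http.Response, error) { return f(r) }

// newActiveSeriesRouter sends /active_series requests down a dedicated
// chain without the results-cache middleware, so responses always reflect
// fresh data; everything else keeps going through the cached chain.
func newActiveSeriesRouter(cached, activeSeries http.RoundTripper) http.RoundTripper {
	return roundTripperFunc(func(req *http.Request) (*http.Response, error) {
		if strings.HasSuffix(req.URL.Path, "/active_series") {
			return activeSeries.RoundTrip(req)
		}
		return cached.RoundTrip(req)
	})
}
```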

@colega (Contributor) left a comment

LGTM, great work.

As discussed on Slack, my main concern is that nothing actually checks that we're not loading all shards into memory before responding. Can we add a test for that? You could orchestrate the next middlewares so that we check we're getting a response before the second shard's next starts writing its JSON.
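
A sketch of what such a test could look like, reusing the roundTripperFunc adapter from the earlier sketch. The sharding middleware and the shardIndexOf, shardResponseFor, newShardingMiddleware, and newActiveSeriesRequest helpers are placeholders, not the ones added in this PR. The second shard's downstream blocks until the caller has read the start of the merged body, so the test can only succeed if the middleware streams instead of buffering all shards first:

```go
func TestActiveSeriesResponseIsStreamed(t *testing.T) {
	releaseSecondShard := make(chan struct{})
	// Placeholder downstream: shard 0 responds immediately, shard 1 blocks
	// until the test has started reading the merged response.
	next := roundTripperFunc(func(req *http.Request) (*http.Response, error) {
		if shardIndexOf(req) > 0 {
			<-releaseSecondShard
		}
		return shardResponseFor(req), nil
	})

	// If the middleware buffered all shards before responding, RoundTrip
	// would block forever here, because shard 1 is held back by the test.
	resp, err := newShardingMiddleware(next).RoundTrip(newActiveSeriesRequest(t))
	require.NoError(t, err)

	buf := make([]byte, 1)
	_, err = resp.Body.Read(buf) // first byte must arrive before shard 1 runs
	require.NoError(t, err)

	close(releaseSecondShard) // let the remaining shard finish
	_, err = io.ReadAll(resp.Body)
	require.NoError(t, err)
}
```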

@colega (Contributor) commented Jan 4, 2024

Thanks for adding the test.

@flxbk flxbk merged commit 7de79e5 into main Jan 4, 2024
28 checks passed
@flxbk flxbk deleted the felix/shard-active-series branch January 4, 2024 11:52