Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't load series multiple times when streaming chunks from store-gateways and only one batch is needed #8039

Merged
merged 45 commits into from
May 23, 2024

Conversation

charleskorn
Copy link
Contributor

@charleskorn charleskorn commented May 3, 2024

What this PR does

Acknowledgement: this has been a joint effort between myself and @zenador

This PR improves the performance of store-gateways when streaming chunks to queriers.

Specifically, it improves the performance of the case where a Select() call selects no more than than -blocks-storage.bucket-store.batch-series-size series (default value is 5000).

Currently this case behaves as follows (assuming the index-cache is empty initially):

  • When a Select() call arrives, the store-gateway needs to first send all selected series' labels to the querier:
    • It calls BucketStore.streamingSeriesForBlocks(), which creates an iterator for the seriesChunkRefsSets for the query. Each seriesChunkRefsSet contains at most -blocks-storage.bucket-store.batch-series-size series.
    • As this iterator is consumed, the series in each seriesChunkRefsSet are loaded and cached as SeriesForRef items in the index-cache. These cache entries are written asynchronously.
  • Once all labels have been sent, it then begins streaming the chunks for those series:
    • This calls BucketStore.streamingChunksSetForBlocks(), which creates the same iterator again (albeit this time asking for chunk refs to be loaded as well)
    • As this iterator is consumed, it will try to fetch the cached series, and if a cache miss occurs, falls back to loading these directly as before.

This last part is the opportunity for improvement: it's very likely that the asynchronous cache write is yet to complete at this point, so the store-gateway will end up loading the series again. However, in the case where the query only needs one batch, recreating the batch is unnecessary: we can just keep the batch in memory and reuse it for both sending labels and sending chunks. (We can't do this if multiple batches are required: that would require keeping all batches in memory, which we are trying to avoid.)

⚠️ It has taken me quite a while for me to get my head around this part of the store-gateway - please review this change carefully, don't assume I know what I'm doing.

Which issue(s) this PR fixes or relates to

Fixes the issue discussed in #6646 (comment)

Checklist

  • Tests updated.
  • [n/a] Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
  • [n/a] about-versioning.md updated with experimental features.

@charleskorn charleskorn changed the title Don't load series refs multiple times when streaming chunks from store-gateways and only one batch is needed Don't load series multiple times when streaming chunks from store-gateways and only one batch is needed May 3, 2024
@charleskorn charleskorn closed this May 6, 2024
@charleskorn charleskorn reopened this May 13, 2024
Copy link
Contributor

@dimitarvdimitrov dimitarvdimitrov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to reviewers: I don't love how invasive this change was, open to ideas and feedback on how to better isolate this. Perhaps we could wrap loadingSeriesChunkRefsSetIterator in another iterator type that either caches the single set or recreates the iterator when we start streaming chunks?

I tend to agree. The logic for what is being reset (in streamingChunksSetForBlocks) and how the strategy and "releasability" change is selected (in loadingSeriesChunkRefsSetIterator.Reset()) are very far away in the call chain.

Maybe we can have something like an iterator factory: instead of letting getSeriesIteratorFromBlocks call directly openBlockSeriesChunkRefsSetsIterator we can pass a factory to getSeriesIteratorFromBlocks.

  1. The first time getSeriesIteratorFromBlocks invokes the factory the factory creates empty iterators.
  2. The factory can make the decision whether there will be only a single batch for the block or multiple (which due to sharding and labels filtering won't be accurate, but it might work most of the time).
  3. If there is only a single batch, then the factory returns an iterator which returns unreleasable batch and caches the result.
  4. In the second invocation from getSeriesIteratorFromBlocks the factory reuses the cached result and returns a releasable batch.
  5. Done?

this way the factory and the caching iterator it returns will be the major place which decide short-term caching, releasability, and the seriesIteratorStrategy

pkg/storegateway/series_chunks.go Outdated Show resolved Hide resolved
pkg/storegateway/series_refs.go Outdated Show resolved Hide resolved
pkg/storegateway/series_refs.go Outdated Show resolved Hide resolved
pkg/storegateway/series_refs.go Outdated Show resolved Hide resolved
pkg/storegateway/series_refs.go Outdated Show resolved Hide resolved
Copy link
Contributor

@dimitarvdimitrov dimitarvdimitrov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice one! I also like this approach better. it looks good for the most part, the only major thing is that the postings are no longer copied before reusing them. Can you double-check if that's causing bugs?

pkg/storegateway/bucket.go Outdated Show resolved Hide resolved
pkg/storegateway/bucket.go Outdated Show resolved Hide resolved
pkg/storegateway/series_refs_streaming.go Outdated Show resolved Hide resolved
pkg/storegateway/series_refs_streaming.go Show resolved Hide resolved
pkg/storegateway/series_refs.go Show resolved Hide resolved
pkg/storegateway/series_refs.go Outdated Show resolved Hide resolved
pkg/storegateway/series_refs.go Outdated Show resolved Hide resolved
pkg/storegateway/series_refs_streaming_test.go Outdated Show resolved Hide resolved
@charleskorn charleskorn marked this pull request as ready for review May 20, 2024 10:08
@charleskorn charleskorn requested a review from a team as a code owner May 20, 2024 10:08
@charleskorn
Copy link
Contributor Author

I have tested this in a test cluster at Grafana Labs and it seems to work fine, so I think this is ready for a final review before merging 🎉

Copy link
Contributor

@dimitarvdimitrov dimitarvdimitrov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, awesome work @charleskorn and @zenador! :shipit:

@dimitarvdimitrov dimitarvdimitrov merged commit 2a0da30 into main May 23, 2024
29 checks passed
@dimitarvdimitrov dimitarvdimitrov deleted the charleskorn/store-gateway-double-work-fix branch May 23, 2024 18:28
narqo pushed a commit to narqo/grafana-mimir that referenced this pull request Jun 6, 2024
…eways and only one batch is needed (grafana#8039)

Co-authored-by: Jeanette Tan <jeanette.tan@grafana.com>
charleskorn added a commit that referenced this pull request Jun 17, 2024
…6646)

* Enable streaming chunks from store-gateways to queriers by default.

# Conflicts:
#	cmd/mimir/help-all.txt.tmpl
#	docs/sources/mimir/configure/about-versioning.md
#	pkg/querier/querier.go

* Add changelog entry.

# Conflicts:
#	CHANGELOG.md

* Fix failing TestQuerierWithBlocksStorageOnMissingBlocksFromStorage test with store-gateway chunks streaming enabled.

* Partially fix cache-related assertions in TestQuerierWithBlocksStorageRunningInSingleBinaryMode

* Address PR feedback, and add more details to failing assertions

# Conflicts:
#	integration/querier_test.go

* Update tests to reflect behaviour of chunks streaming with #8039 in place.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants