store-gateway: clean up chunks fetching, deprecate bucketed chunks pool #4996
Conversation
```diff
@@ -65,7 +67,7 @@ func (r *bucketChunkReader) addLoad(id chunks.ChunkRef, seriesEntry, chunkEntry
 	}
 	r.toLoad[seq] = append(r.toLoad[seq], loadIdx{
 		offset: off,
-		length: length,
+		length: util_math.Max(varint.MaxLen32, length), // If the length is 0, we need to at least fetch the length of the chunk.
```
After adding tests, I discovered that the bucketChunkReader cannot handle 0 estimations, even though 0 is a valid value according to this godoc:

mimir/pkg/storegateway/series_refs.go, line 184 in 14080d0:

```go
// length will be 0 when the length of the chunk isn't known
```

This wasn't causing bugs because `length` would never have been 0, as it is always set to a non-zero value here (a sketch of the fix follows the snippet):

mimir/pkg/storegateway/series_refs.go, lines 973 to 988 in 14080d0:
```go
var chunkLen uint32
// We can only calculate the length of this chunk, if we know the ref of the next chunk
// and the two chunks are in the same segment file.
// We do that by taking the difference between the chunk references. This works since the chunk references are offsets in a file.
// If the chunks are in different segment files (unlikely, but possible),
// then this chunk ends at the end of the segment file, and we don't know how big the segment file is.
if nextRef, ok := nextChunkRef(partitions, pIdx, cIdx); ok && chunkSegmentFile(nextRef) == chunkSegmentFile(c.Ref) {
	chunkLen = chunkOffset(nextRef) - chunkOffset(c.Ref)
	if chunkLen > tsdb.EstimatedMaxChunkSize {
		// Clamp the length in case chunks are scattered across a segment file. This should never happen,
		// but if it does, we don't want to have an erroneously large length.
		chunkLen = tsdb.EstimatedMaxChunkSize
	}
} else {
	chunkLen = tsdb.EstimatedMaxChunkSize
}
```
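To illustrate the fix, here is a minimal, self-contained sketch (not Mimir's code; `minFetchLen` is a hypothetical stand-in for `util_math.Max(varint.MaxLen32, length)`) of why clamping the fetch length to at least `varint.MaxLen32` guarantees the chunk's data-length uvarint can always be decoded, even when the estimate is 0:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// A chunk in a TSDB segment file is laid out as:
//   <uvarint data length> <1 byte encoding> <data> <4-byte CRC32>
// A uint32 encoded as a uvarint takes at most 5 bytes
// (binary.MaxVarintLen32, mirrored by varint.MaxLen32 in Mimir).

// minFetchLen clamps a (possibly zero) length estimate so the fetch
// always covers the chunk's length prefix.
func minFetchLen(estimated int) int {
	if estimated < binary.MaxVarintLen32 {
		return binary.MaxVarintLen32
	}
	return estimated
}

func main() {
	// Encode a chunk's data-length prefix as the segment file would store it.
	prefix := make([]byte, binary.MaxVarintLen32)
	binary.PutUvarint(prefix, 1234)

	// Even with a 0 estimate, the clamped fetch covers the whole prefix,
	// so the real chunk length can be decoded and the rest refetched.
	fetched := prefix[:minFetchLen(0)]
	dataLen, n := binary.Uvarint(fetched)
	fmt.Printf("decoded length %d from %d byte(s)\n", dataLen, n)
}
```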
Great job, LGTM! The `bucketChunkReader.loadChunks()` method is easier to follow now 👍 I just left a few minor comments.
> I marked them as deprecated, but they are also unused. I think this falls out of line with our deprecation policy, but presently the options were only used when refetching chunks (a very rare scenario) and weren't being respected in the regular case.

Not a problem for me.
```diff
 		return errors.Wrap(err, "populate chunk")
 	}
 	localStats.chunksTouched++
-	localStats.chunksTouchedSizeSum += chunkLen + crc32.Size
+	localStats.chunksTouchedSizeSum += chunkEncDataLen
```
[nit] Previously we were also counting the varint length and `crc32.Size`.
I'm not sure whether we should, actually. It's bytes we throw away directly from the io.Reader. Is it different from discarding bytes of a different chunk?
I vaguely remember something: was the CRC added by you because you found out that for some special chunks the 4 bytes of the CRC were a significant % of the whole chunk data? 🤔

If we don't count the varint and CRC length, could this be misleading when computing the "overfetching"?
Yeah, that makes sense. The CRC is also accounted for here:

mimir/pkg/storegateway/series_chunks.go, line 653 in 14080d0:

```go
total += varint.UvarintSize(uint64(dataLen)) + 1 + dataLen + crc32.Size
```

so it makes sense to also count it in `loadChunks`.
Shouldn't we also add `varint.UvarintSize(uint64(dataLen))` for the same reason?
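For reference, here is a minimal sketch of the full on-disk footprint the accounting above describes (`encodedChunkSize` is a hypothetical helper, not Mimir's code): the uvarint length prefix, one encoding byte, the data itself, and the CRC32 trailer.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/crc32"
)

// encodedChunkSize (hypothetical) returns the number of bytes one chunk
// occupies in a TSDB segment file:
// <uvarint data length> + <1 encoding byte> + <data> + <CRC32 trailer>.
func encodedChunkSize(dataLen int) int {
	buf := make([]byte, binary.MaxVarintLen64)
	prefixLen := binary.PutUvarint(buf, uint64(dataLen))
	return prefixLen + 1 + dataLen + crc32.Size
}

func main() {
	// A 200-byte chunk: 2-byte uvarint prefix + 1 + 200 + 4 = 207 bytes.
	fmt.Println(encodedChunkSize(200))
}
```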
Force-pushed from f0d9c4e to ef21749.
Thanks for addressing my comments. New commits LGTM. I also replied to your comment with another question.
With this I will also remove the flag for the chunks pool from the Mimir jsonnet and helm. I pushed this commit to do it: ebe56c5
There is a strange 1k LOC diff in the helm manifests. I will take a look later.
Fixed. It was a matter of removing a local file and regenerating the helm tests.
Still LGTM. I just left a final comment: #4996 (comment)
The following deprecated flags are removed:

- `-blocks-storage.bucket-store.max-chunk-pool-bytes`
- `-blocks-storage.bucket-store.chunk-pool-min-bucket-size-bytes`
- `-blocks-storage.bucket-store.chunk-pool-max-bucket-size-bytes`

See #4996 for more details.
* Remove `-querier.query-ingesters-within` config

  The `-querier.query-ingesters-within` config has been moved from a global config to a per-tenant limit config. See #4287 for more details.

* Remove `-querier.iterators` and `-querier.batch-iterators`

  The `-querier.iterators` and `-querier.batch-iterators` configuration parameters have been removed. See #5114 for more details.

* Remove deprecated bucket store flags

  The following deprecated flags are removed:
  - `-blocks-storage.bucket-store.max-chunk-pool-bytes`
  - `-blocks-storage.bucket-store.chunk-pool-min-bucket-size-bytes`
  - `-blocks-storage.bucket-store.chunk-pool-max-bucket-size-bytes`

  See #4996 for more details.

* Remove `-blocks-storage.bucket-store.bucket-index.enabled` config

  This configuration parameter has been removed. Mimir has been running with the bucket index enabled by default since 2.0, and it is now not possible to disable it. See #5051 for more details.

* Update CHANGELOG.md
What this PR does
I started this as an internal refactor to make use of the existing `offsetTrackingReader` that #4926 introduced (a sketch of its general shape follows below). But it turned out that I could slightly refactor the chunks refetching logic and get rid of a lot of code (~800 LOC). I also covered it with some tests, because it was untested until now. If it's easier to review, I can split this into multiple PRs.
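For context, here is a minimal sketch of what such a reader might look like (an assumption about its interface, not the actual type from #4926): it wraps an io.Reader, remembers how many bytes were consumed, and lets callers skip forward to a known file offset instead of issuing a separate range request per chunk.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// offsetTrackingReader (sketch) counts the bytes consumed so far, so callers
// can discard the gap between two chunks and keep reading from a single
// range request.
type offsetTrackingReader struct {
	r      io.Reader
	offset uint64 // absolute offset of the next unread byte
}

func (r *offsetTrackingReader) Read(p []byte) (int, error) {
	n, err := r.r.Read(p)
	r.offset += uint64(n)
	return n, err
}

// SkipTo discards bytes until the reader is positioned at off.
func (r *offsetTrackingReader) SkipTo(off uint64) error {
	if off < r.offset {
		return fmt.Errorf("cannot skip backwards: at %d, want %d", r.offset, off)
	}
	n, err := io.CopyN(io.Discard, r.r, int64(off-r.offset))
	r.offset += uint64(n)
	return err
}

func main() {
	r := &offsetTrackingReader{r: strings.NewReader("chunk-A....chunk-B")}
	buf := make([]byte, 7)
	io.ReadFull(r, buf) // read chunk A
	r.SkipTo(11)        // skip the gap instead of refetching
	io.ReadFull(r, buf) // read chunk B
	fmt.Println(string(buf), "at offset", r.offset)
}
```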
This allowed removing the method for fetching a single chunk range and all the related code:

- `pool.BucketedBytes`
- `storegateway.byteRange`
- `bucketBlock.readChunkRange`
Because of this, the following config options are no longer used and can be removed:

- `-blocks-storage.bucket-store.max-chunk-pool-bytes`
- `-blocks-storage.bucket-store.chunk-pool-min-bucket-size-bytes`
- `-blocks-storage.bucket-store.chunk-pool-max-bucket-size-bytes`
I marked them as deprecated, but they are also unused. I think this falls out of line with our deprecation policy, but presently the options were only used when refetching chunks (a very rare scenario) and weren't being respected in the regular case.
Related to #3939.
Checklist
- `CHANGELOG.md` updated - the order of entries should be `[CHANGE]`, `[FEATURE]`, `[ENHANCEMENT]`, `[BUGFIX]`