-
Notifications
You must be signed in to change notification settings - Fork 565
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
store-gateway: avoid unnecessarily writing lazy-loaded blocks snapshot #10740
base: main
Are you sure you want to change the base?
store-gateway: avoid unnecessarily writing lazy-loaded blocks snapshot #10740
Conversation
### Summary This PR introduces two key optimizations to the `indexheader.Snapshotter` component:SHA-1 Checksum Mechanism: Added a checksum-based optimization to avoid persisting unchanged JSON data, reducing unnecessary IOPS. Simplified Snapshot Format: Changed the snapshot format from `map[ulid.ULID]int64` to `map[ulid.ULID]struct{}`, eliminating unused timestamp data. ### Motivation In clusters with many tenants, the `Snapshotter` component can generate significant disk I/O by repeatedly writing identical JSON data to disk. This is particularly problematic on systems with low-performance disks. Additionally, the previous implementation stored timestamps that were never actually used in practice. ### Compatibility Considerations #### Backward Compatibility This approach maintains backward compatibility. `RestoreLoadedBlocks()` still reads the old format with timestamps and converts it to the new format. The JSON structure remains the same. This ensures that existing snapshots continue to work without requiring any migration. ### Forward Compatibility Forward compatibility is maintained because we continue to write the same JSON structure, just with simplified content. The `indexHeaderLastUsedTime` field is still present in the JSON, ensuring that older code can still parse the structure. Since older code only cares about the keys (block IDs) and not the values (timestamps), the change in value type doesn't affect functionality. Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes generally LGTM.
Do we need to preserve backwards compatibility? Seems like we could reduce some complexity by making a backwards incompatible change to store only the list of loaded blocks. The only downside to this that I see is that store-gateways will lazy-load all blocks when they first start after upgrading to this, but that seems acceptable to me if it gives us long-term simplicity.
@@ -152,16 +152,15 @@ func (p *ReaderPool) onLazyReaderClosed(r *LazyBinaryReader) { | |||
delete(p.lazyReaders, r) | |||
} | |||
|
|||
// LoadedBlocks returns a new map of lazy-loaded block IDs and the last time they were used in milliseconds. | |||
func (p *ReaderPool) LoadedBlocks() map[ulid.ULID]int64 { | |||
func (p *ReaderPool) LoadedBlocks() map[ulid.ULID]struct{} { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we just return a slice here? The only place that calls this is Snapshotter.PersistLoadedBlocks
, and it doesn't need a map.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🔥 Works for me.
It's the change a mimir user should see the effect of, right? Do you want to mention this change in the changelog, maybe?
I suspect, the backwards (and forwards compatibility) here is important, because, otherwise, this will hurt the rollout. I.e. in a busy cell, where queries actively touch older blocks, not loading the previous snapshots will cause the delays of the runtime queries. The same is for a hypothetical scenario, where we needed to roll back the change. |
my thinking was the same as @narqo. I see two approaches if we don't keep compatibility
Maybe a middle-ground is to support the previous format to read from, but not allow for rollbacks. That is - we read the old format, but from now on only write the new format. That way in a future release we can delete the support for the old format and drop this complexity. |
Summary
This PR introduces two key optimizations to the
indexheader.Snapshotter
Checksum Mechanism: Added a checksum-based optimization to avoid persisting unchanged JSON data, reducing unnecessary IOPS. We're using sha256 because the linters think sha1 is not secure
Simplified Snapshot Format: Changed the snapshot format from
map[ulid.ULID]int64
tomap[ulid.ULID]struct{}
, eliminating unused timestamp data.Motivation
In clusters with many tenants, the
Snapshotter
component can generate significant disk I/O by repeatedly writing identical JSON data to disk. This is particularly problematic on systems with low-performance disks. Additionally, the previous implementation stored timestamps that were never actually used in practice.Compatibility Considerations
Backward Compatibility
This approach maintains backward compatibility.
RestoreLoadedBlocks()
still reads the old format with timestamps and converts it to the new format. The JSON structure remains the same. This ensures that existing snapshots continue to work without requiring any migration.Forward Compatibility
Forward compatibility is maintained because we continue to write the same JSON structure, just with simplified content. The
indexHeaderLastUsedTime
field is still present in the JSON, ensuring that older code can still parse the structure. Since older code only cares about the keys (block IDs) and not the values (timestamps), the change in value type doesn't affect functionality.Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]
.about-versioning.md
updated with experimental features.