-
Notifications
You must be signed in to change notification settings - Fork 529
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve compactor job estimation by tracking external labels in bucket index #7745
Conversation
…n match compaction.
… as it exists in block labels.
…compaction-planner and estimator agree.
I welcome a review from anyone, but I will be waiting for @pstibrany to approve. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your work on this PR. Changes look good to me, I only left some minor comments.
This estimation will remain inaccurate for blocks with external labels that aren't yet present in the bucket index. I believe this will stick around until those blocks are recompacted or retentioned out.
Correct. Bucket index updater code will not update blocks that are already in the index, but don't have their labels stored. For that we would need to increase bucket version, then bucket index updater would refetch all meta.json files and fix labels for all blocks. That would be a one-time operation for every tenant.
Given the low impact of the problem we're fixing by this PR, we may not want to do that in this PR, and live with wrong estimated jobs count until all blocks are naturally fixed.
That sounds rational to me- especially as I understand the current counting problem isn't really causing mass confusion. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
What this PR does
This PR adds a
Labels
field to the bucket index'sBlock
struct. Compactor now maintains this field in the bucket index when it writes a block. This allows job estimation metrics (cortex_bucket_index_estimated_compaction_jobs
) andtools/compaction-planner
to better reflect outstanding compaction jobs because the compactor is only going to compact blocks that have the same external labels.As we're now copying external block labels into the bucket index, this PR also makes the estimation match compactor's job planning by ignoring the deprecated
__org_id__
and__ingester_id__
labels.While the bucket index may have information about many blocks, each block today will only have <= 2 external labels, so concerns about blowing up the bucket index are minimal for now. But a close eye will need to be kept if any new labels are to be added later.
This estimation will remain inaccurate for blocks with external labels that aren't yet present in the bucket index. I believe this will stick around until those blocks are recompacted or retentioned out.
Which issue(s) this PR fixes or relates to
Fixes #7318
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]
.about-versioning.md
updated with experimental features.