-
Notifications
You must be signed in to change notification settings - Fork 6.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix point lookup on range tombstone sentinel endpoint #4829
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ajkr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
This is a corruption issue and it's in 5.18.fb. It is not in 5.17.fb or earlier. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might be able to solve this in a cleaner way by modifying FindFileInRange
to increment by one the limit passed to lower_bound
. That lower_bound
uses internal comparator so it should be able to pass over a file ending in range tombstone sentinel key when the lookup key has the same user key.
@ajkr has updated the pull request. Re-import the pull request |
092caf7
to
61b8571
Compare
@ajkr has updated the pull request. Re-import the pull request |
OK, it is improved now that it reuses the existing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ajkr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@ajkr has updated the pull request. Re-import the pull request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ajkr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Yeah I'm happy with this approach. It should only prevent cascading in cases where the lookup key definitely does not exist in the file range chosen while making comparisons at the previous level. For example, if the lookup key is outside the level's range, or the lookup key is within a file's range from user-key perspective but not from internal key perspective. Arguably it's better to not let |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm.
@ajkr has updated the pull request. Re-import the pull request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ajkr is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Summary: Previously for point lookup we decided which file to look into based on user key overlap only. We also did not truncate range tombstones in the point lookup code path. These two ideas did not interact well in cases like this: - L1 has range tombstone [a, c)#1 and point key b#2. The data is split between file1 with range [a#1,1, b#72057594037927935,15], and file2 with range [b#2, c#1]. - L1's file2 gets compacted to L2. - User issues `Get()` for b#3. - L1's file1 is opened and the range tombstone [a, c)#1 is found for b, while no point-key for b is found in L1. - `Get()` assumes that the range tombstone must cover all data in that range in lower levels, so short circuits and returns `NotFound`. The solution to this problem is to not look into files that only overlap with the point lookup at a range tombstone sentinel endpoint. In the above example, this would mean not opening L1's file1 or its tombstones during the `Get()`. Pull Request resolved: #4829 Differential Revision: D13561355 Pulled By: ajkr fbshipit-source-id: a13c21c816870a2f5d32a48af6dbd719a7d9d19f
Previously for point lookup we decided which file to look into based on user key overlap only. We also did not truncate range tombstones in the point lookup code path. These two ideas did not interact well in cases like this:
Get()
for b#3.Get()
assumes that the range tombstone must cover all data in that range in lower levels, so short circuits and returnsNotFound
.The solution to this problem is to not look into files that only overlap with the point lookup at a range tombstone sentinel endpoint. In the above example, this would mean not opening L1's file1 or its tombstones during the
Get()
.Test Plan: