-
Notifications
You must be signed in to change notification settings - Fork 6.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mark blob files not needed by any memtables/SSTs obsolete #6032
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ltamasi has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
3bad70b
to
2eaa690
Compare
@ltamasi has updated the pull request. Re-import the pull request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ltamasi has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By reading the code, I didn't quite figure out how this handles a blob file with blobs only linked to unflushed memtables. But I must have missed something and will discuss with you offline.
Yeah, that could be an issue. Will look into it and discuss with you. |
…utable, cleanups (initialize obsolete_sequence_ in ctor, remove unused member)
…ile if the sequence number at which it was marked immutable has not been flushed yet
… file does not reference any blob files)
@ltamasi has updated the pull request. Re-import the pull request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ltamasi has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@ltamasi has updated the pull request. Re-import the pull request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ltamasi has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@ltamasi has updated the pull request. Re-import the pull request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ltamasi has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
open_non_ttl_file_ = nullptr; | ||
} | ||
} | ||
const SequenceNumber sequence = GetLatestSequenceNumber(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is OK for now but it's interesting to see when we start to implement blob copying in compactions, whether we want to copy to a separate blob file. If these blobs copied out go to a separate files, those file would have much older sequence numbers. It may not be a big deal even we take latest sequence number though.
) Summary: The patch adds logic to mark no longer needed blob files obsolete upon database open and whenever a flush or compaction completes. Unneeded blob files are detected by iterating through live immutable non-TTL blob files starting from the lowest-numbered one, and stopping when a blob file used by any SSTs or potentially used by memtables is found. (The latter is determined by comparing the sequence number at which the blob file became immutable with the largest sequence number received in flush notifications.) In addition, the patch cleans up the logic around closing and obsoleting blob files and enforces invariants around this area (blob files are now guaranteed to go through the stages mutable-non-obsolete, immutable-non-obsolete, and immutable-obsolete in this order). Pull Request resolved: facebook#6032 Test Plan: Extended unit tests and tested using the BlobDB mode of `db_bench`. Differential Revision: D18495610 Pulled By: ltamasi fbshipit-source-id: 11825b84af74f3f4abfd9bcae04e80870ae58961
Summary:
The patch adds logic to mark no longer needed blob files obsolete upon database open
and whenever a flush or compaction completes. Unneeded blob files are detected by
iterating through live immutable non-TTL blob files starting from the lowest-numbered one,
and stopping when a blob file used by any SSTs or potentially used by memtables is found.
(The latter is determined by comparing the sequence number at which the blob file
became immutable with the largest sequence number received in flush notifications.)
In addition, the patch cleans up the logic around closing and obsoleting blob files and
enforces invariants around this area (blob files are now guaranteed to go through the
stages mutable-non-obsolete, immutable-non-obsolete, and immutable-obsolete in this
order).
Test Plan:
Extended unit tests and tested using the BlobDB mode of
db_bench
.