Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move pipeline write waiting logic into WaitForPendingWrites() #5716

Closed
wants to merge 2 commits into from

Conversation

siying
Copy link
Contributor

@siying siying commented Aug 16, 2019

Summary:
In pipeline writing mode, memtable switching needs to wait for memtable writing to finish to make sure that when memtables are made immutable, inserts are not going to them. This is currently done in DBImpl::SwitchMemtable(). This is done after flush_scheduler_.TakeNextColumnFamily() is called to fetch the list of column families to switch. The function flush_scheduler_.TakeNextColumnFamily() itself, however, is not thread-safe when being called together with flush_scheduler_.ScheduleFlush().
This change provides a fix, which moves the waiting logic before flush_scheduler_.TakeNextColumnFamily(). WaitForPendingWrites() is a natural place where the logic can happen.

Test Plan: Run all tests with ASAN and TSAN.

@siying
Copy link
Contributor Author

siying commented Aug 16, 2019

CC @yiwu-arbug

@miasantreble
Copy link
Contributor

How about re-enable pipeline writes like PR #5636 did and run TSAN to make sure it's now thread safe?

@yiwu-arbug
Copy link
Contributor

LGTM overall, but don't quite understand what's the issue with flush scheduler. I think it is thread-safe by itself?

@siying
Copy link
Contributor Author

siying commented Aug 19, 2019

@yiwu-arbug multiple threads calling ScheduleFlush() is thread-safe, but TakeNextColumnFamily() is not.

@siying
Copy link
Contributor Author

siying commented Aug 19, 2019

@miasantreble I did run tsan on my dev box and it passed. Let me turn it on in crash test.

@siying
Copy link
Contributor Author

siying commented Aug 19, 2019

@yiwu-arbug it's also that, getting list of CFs whose memtable is full before waiting is less straight-forward to reason about. It means that, after waiting for the memtable writes finish, we might leave some full memtables unflushed, and restart the pipeline again. It might not be necessarily wrong, but harder to prove.

@yiwu-arbug
Copy link
Contributor

@yiwu-arbug multiple threads calling ScheduleFlush() is thread-safe, but TakeNextColumnFamily() is not.

One issue I can see is TakeNextColumnFamily() not using CAS operation to modify head_. Are you referring to this issue? If so, I think we should fix it.

@yiwu-arbug
Copy link
Contributor

@yiwu-arbug it's also that, getting list of CFs whose memtable is full before waiting is less straight-forward to reason about. It means that, after waiting for the memtable writes finish, we might leave some full memtables unflushed, and restart the pipeline again. It might not be necessarily wrong, but harder to prove.

agree.

@siying
Copy link
Contributor Author

siying commented Sep 25, 2019

ping @maysamyabandeh

@maysamyabandeh
Copy link
Contributor

@yiwu-arbug if it looks good to you, can you approve the PR?

Summary:
In pipeline writing mode, memtable switching needs to wait for memtable writing to finish to make sure that when memtables are made immutable, inserts are not going to them. This is currently done in DBImpl::SwitchMemtable(). This is done after flush_scheduler_.TakeNextColumnFamily() is called to fetch the list of column families to switch. The function flush_scheduler_.TakeNextColumnFamily() itself, however, is not thread-safe when being called together with flush_scheduler_.ScheduleFlush().
This change provides a fix, which moves the waiting logic before flush_scheduler_.TakeNextColumnFamily(). WaitForPendingWrites() is a natural place where the logic can happen.

Test Plan: Run all tests with ASAN and TSAN.
@siying siying force-pushed the pipeline_wait_write branch from d6c1ad9 to 13f7391 Compare October 29, 2019 20:11
Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@siying is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

This pull request has been merged in a3960fc.

merryChris pushed a commit to merryChris/rocksdb that referenced this pull request Nov 18, 2019
…ok#5716)

Summary:
In pipeline writing mode, memtable switching needs to wait for memtable writing to finish to make sure that when memtables are made immutable, inserts are not going to them. This is currently done in DBImpl::SwitchMemtable(). This is done after flush_scheduler_.TakeNextColumnFamily() is called to fetch the list of column families to switch. The function flush_scheduler_.TakeNextColumnFamily() itself, however, is not thread-safe when being called together with flush_scheduler_.ScheduleFlush().
This change provides a fix, which moves the waiting logic before flush_scheduler_.TakeNextColumnFamily(). WaitForPendingWrites() is a natural place where the logic can happen.
Pull Request resolved: facebook#5716

Test Plan: Run all tests with ASAN and TSAN.

Differential Revision: D18217658

fbshipit-source-id: b9c5e765c9989645bf10afda7c5c726c3f82f6c3
yiwu-arbug pushed a commit to tikv/rocksdb that referenced this pull request Dec 17, 2019
…ok#5716)

Summary:
In pipeline writing mode, memtable switching needs to wait for memtable writing to finish to make sure that when memtables are made immutable, inserts are not going to them. This is currently done in DBImpl::SwitchMemtable(). This is done after flush_scheduler_.TakeNextColumnFamily() is called to fetch the list of column families to switch. The function flush_scheduler_.TakeNextColumnFamily() itself, however, is not thread-safe when being called together with flush_scheduler_.ScheduleFlush().
This change provides a fix, which moves the waiting logic before flush_scheduler_.TakeNextColumnFamily(). WaitForPendingWrites() is a natural place where the logic can happen.
Pull Request resolved: facebook#5716

Test Plan: Run all tests with ASAN and TSAN.

Differential Revision: D18217658

fbshipit-source-id: b9c5e765c9989645bf10afda7c5c726c3f82f6c3
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants