1.4.2 turner #97
Conversation
…condition on close of async files and remove tons of redundant code.
…e threads move too fast for Google unit tests. Use of volatile needed, or I could be wrong. Checking in to move off laptop.
…r file to resolve.
…s do that naturally. Adjust code
I see a test failure running.

cache leak test fails 3 out of 4 times ... minimum. Jon has previously said that we should likely remove it. But whenever I get ready to do that ... it seems to work again.
I did a Riak CS riak_test run using this branch for a functional sanity check and all tests passed.

👍 on the PosixMmapFile changes
```cpp
for (size_t i = 0; i < filenames.size(); i++) {
  KeepOrDelete(filenames[i], level, live);
  KeepOrDelete(filenames[i], -1, live);
```
What's up with the -1 here?
The GitHub differencing logic is piecing together stuff from high and low in the file. The real change is that I inserted "if (RunningCompactionCount()<2)" and then indented everything else in the function by one tab. Old (red) line 248 is now new (green) line 256. Move 7 lines down in red, then in green: both have the same "KeepOrDelete(filenames[i], -1, live)". Move 19 lines down from the starting point and you will see "KeepOrDelete(filenames[i], level, live);" in both.
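To make the shape of the change concrete, here is a hypothetical sketch. Only the RunningCompactionCount() guard and the KeepOrDelete loop reflect the diff; the class name, member types, and bodies are illustrative scaffolding, not the actual source:

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Sketch of the restructuring described above (names/types assumed).
class DBImplSketch {
 public:
  int RunningCompactionCount() const { return running_compactions_; }

  void KeepOrDelete(const std::string& fname, int level,
                    const std::set<uint64_t>& live) {
    // the real version decides whether 'fname' is still referenced
    (void)fname; (void)level; (void)live;
  }

  void DeleteObsoleteFiles(int level) {
    // The inserted guard: skip the deletion scan entirely while two or
    // more compactions are in flight.
    if (RunningCompactionCount() < 2) {
      // Loop body is unchanged from the old code; it was merely indented
      // one level, which is why GitHub paired unrelated red/green lines.
      for (size_t i = 0; i < filenames_.size(); i++)
        KeepOrDelete(filenames_[i], level, live_);
    }
  }

 private:
  int running_compactions_ = 0;
  std::vector<std::string> filenames_;
  std::set<uint64_t> live_;
};
```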
FYI, Jon has asked me to look over this PR as well. Just finished reading the description in the wiki, digging into the code now. Since Kelly/Andrew are already reviewing, I'm going to try to work through things quickly. Jon simply wanted someone who was familiar with the async eleveldb threading changes (e.g. me) to be in the loop.
Thanks @jtuple. After looking over this code I don't see any obvious errors. Given that my C++ is rusty and I don't quite understand all the logic here, I'm going to give this a 'not -1'. h/t @kellymclaughlin
```cpp
++running_compactions_;
Log(options_.info_log, "Background compact start: %u", running_compactions_);

assert(IsCompactionScheduled());
```
Minor note: this assertion should probably be changed to only check for bg_compaction_scheduled_, as the other condition that makes IsCompactionScheduled() true is a non-NULL imm, but imm compaction isn't handled by BackgroundCall anymore.
no. Today, bg_compaction_scheduled_ is a flag for any compaction BUT imm compaction. This routine should only be called for a non-imm compaction AND it should only be called when bg_compaction_scheduled_ is set.
I agree with you. This assertion isn't checking that case. It's checking IsCompactionScheduled(), which is true when bg_compaction_scheduled_ OR imm != NULL. If we somehow end up in this function when bg_compaction_scheduled_ == false but imm != NULL, the assertion should fire. Obviously, this doesn't affect things in production, but it seems helpful to have a proper assertion to help with future debugging in later code changes.
argh. I misread everything. assertion needs to change.
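To make the agreed change concrete, a minimal before/after fragment. It assumes, per the discussion above, that IsCompactionScheduled() is essentially bg_compaction_scheduled_ || imm != NULL; the surrounding function is omitted:

```cpp
// Before: passes whenever bg_compaction_scheduled_ OR imm != NULL,
// so it stays silent in the bad case (flag false, imm non-NULL).
assert(IsCompactionScheduled());

// After: BackgroundCall no longer handles imm compaction, so assert
// only the flag itself; the bad case above now trips the assertion.
assert(bg_compaction_scheduled_);
```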
Consider the code in hot_threads.cc#L108-L114:

```cpp
// only wait if we are really sure no work pending
if (0==m_Pool.m_WorkQueueAtomic)
{
    // yes, thread going to wait. set available now.
    m_Available=1;
    m_Condition.Wait();
}   // if
```

As well as the code in hot_threads.cc#L260-L268:

```cpp
// no waiting threads, put on backlog queue
{
    SpinLock lock(&m_QueueLock);
    inc_and_fetch(&m_WorkQueueAtomic);
    m_WorkQueue.push_back(item);
}

// to address race condition, thread might be waiting now
FindWaitingThread(NULL);
```

If the worker thread is preempted after performing the conditional check, but before setting m_Available, the following interleaving is possible:

```cpp
/* thread 1: */ if (0==m_Pool.m_WorkQueueAtomic)
/* thread 2: */ inc_and_fetch(&m_WorkQueueAtomic);
/* thread 2: */ m_WorkQueue.push_back(item);
/* thread 2: */ FindWaitingThread(NULL);
/* thread 1: */ m_Available=1;
/* thread 1: */ m_Condition.Wait();
```

Thus, thread 2 queues work, tries to find an available thread, fails to find thread 1, and returns. Afterwards, thread 1 goes to sleep, never to be signaled.

I'm not sure how likely this is to ever occur, but I was able to reliably hit this condition by using a 1-thread pool and adding a small busywait before the m_Available=1 assignment.

To make things clear, I added additional logging in this commit, where thread events are logged. A combined log of thread + LevelDB events for a deadlock scenario is here. As shown, we get into a state where we hit exactly this lost wakeup and the pool deadlocks.

All in all, I have no idea how likely this race is to matter in the real world, with >1 background thread and no artificial delay. But I think it is still a remote possibility. Thoughts?
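For reference, a minimal self-contained sketch of the usual way to close this window. It uses standard C++11 primitives, not the actual hot_threads.cc types, and it is not the actual mv-hot-threads2 patch: the producer's enqueue and the consumer's check-then-wait are serialized by one mutex, so the interleaving above becomes impossible:

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>

struct PoolSketch {
    std::mutex              m_Mutex;
    std::condition_variable m_Condition;
    std::deque<int>         m_WorkQueue;   // stand-in for real work items

    // producer side (compare hot_threads.cc#L260-L268)
    void Submit(int item) {
        {
            std::lock_guard<std::mutex> lock(m_Mutex);
            m_WorkQueue.push_back(item);   // enqueue happens under the lock
        }
        m_Condition.notify_one();          // cannot be lost: the waiter
                                           // re-checks under the same lock
    }

    // consumer side (compare hot_threads.cc#L108-L114)
    int Take() {
        std::unique_lock<std::mutex> lock(m_Mutex);
        // Predicate is re-evaluated under the lock each time we wake, so
        // preemption between "check" and "wait" can no longer lose a signal.
        m_Condition.wait(lock, [this] { return !m_WorkQueue.empty(); });
        int item = m_WorkQueue.front();
        m_WorkQueue.pop_front();
        return item;
    }
};
```

The point is only the locking discipline; whether the real fix keeps the lock-free m_WorkQueueAtomic fast path is a separate design choice.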
Joe comment #1: the two "new" functions were added as part of the mv-flexcache branch of Riak 2.0. I pulled the file from there. But you are correct that you might be the first person reviewing them.

Joe comment #2: I agree with your assessment. The chance of the stall in eleveldb today is very slim since there are 71 threads (a prime number larger than 64, which is a common vnode count). The thread counts will be much lower in leveldb's usage. This is a bug that needs to be addressed. I am comfortable with it going to this targeted, high-volume customer. But low-traffic sites are susceptible.
Spent a lot of time pushing this branch both inside and outside of Riak. In my non-Riak testing, I used a micro-benchmark that tested having 10 to 128 concurrent open DBs with 10 to 400 concurrent Erlang processes per DB issuing reads/writes. This is much more concurrency than Riak normally can provide. Multiple runs over 2 hours, along with different object sizes, read/write ratios, etc., all worked without a hitch. I wasn't testing performance -- I was trying to hit any race conditions or other concurrency issues.

These runs alternated between starting with fresh DBs on each run and reusing data created on a prior run. Likewise, I also 'kill -9'ed the Erlang VM on several of these runs, such that the follow-up run was starting from data that needed to run recovery. Things appeared to work as intended. Looking at the LOGs shows recovery kicking in when needed on startup, shows both background imm writes and compactions occurring as necessary, etc.

I've carefully looked over the new code. Once this is stabilized, I do think code cleanup is necessary not only for "cleanup" purposes, but also to prevent us accidentally reintroducing code that may make the concurrent calls this design forbids.

Other than the noted race condition, there is nothing else I'm worried about in this code at this time. I agree that a pool with >1 thread is unlikely to hit the case I noted. +1 for customer deployment with this branch. Agree with @matthewvon about fixing the race condition for 2.0.
issue addressed in mv-hot-threads2
Note: spin locks taken from an earlier PR already approved for Riak 2.0, and hot_threads taken from the eleveldb async threads of Riak 1.3.