-
-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NRG: Wait for goroutines to shutdown when recreating group #5832
Conversation
9613916
to
01a15f8
Compare
01a15f8
to
0ec0f43
Compare
61b9a2e
to
3cc6288
Compare
3cc6288
to
751dec0
Compare
Have rebased, this one is ready for review. |
server/raft.go
Outdated
// Wait for goroutines to shut down before trying to clean up | ||
// the log below, otherwise we might end up deleting the store | ||
// from underneath running goroutines. | ||
n.Unlock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have these patterns for sure elsewhere but I try hard to avoid them if at all possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we can avoid it here, it's the only way we can potentially avoid a deadlock if the runAs
goroutine is finishing up processing some operations.
03b9f1d
to
4a02a47
Compare
566c9cc
to
dcd8167
Compare
server/raft.go
Outdated
@@ -1728,12 +1731,17 @@ func (n *raft) Stop() { | |||
n.shutdown(false) | |||
} | |||
|
|||
func (n *raft) WaitForStop() { | |||
n.wg.Wait() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we check that the state is set to stopped to avoid calling in here and never returning?
server/raft.go
Outdated
@@ -1744,12 +1752,12 @@ func (n *raft) shutdown(shouldDelete bool) { | |||
if n.state.Swap(int32(Closed)) == int32(Closed) { | |||
// If we get called again with shouldDelete, in case we were called first with Stop() cleanup | |||
if shouldDelete { | |||
n.wg.Wait() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we hold the lock here this may never return.
This should fix some logical races where multiple sets of goroutines can fight over the same store directory, i.e. when shutting down and recreating a group rapidly. Signed-off-by: Neil Twigg <neil@nats.io>
Signed-off-by: Neil Twigg <neil@nats.io>
dcd8167
to
d2ed0d9
Compare
3624123
to
dc65149
Compare
Signed-off-by: Neil Twigg <neil@nats.io>
dc65149
to
5622199
Compare
I have shuffled things around a bit:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This should fix some logical races where multiple sets of goroutines can fight over the same store directory, i.e. when shutting down and recreating a group rapidly.
Signed-off-by: Neil Twigg neil@nats.io