-
Notifications
You must be signed in to change notification settings - Fork 530
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix a few Supervisor bugs #3972
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ouch this is hairy. Nice work. :) Some minor nits.
class SupervisorSpec extends BaseSpec { | ||
class SupervisorSpec extends BaseSpec with DetectPlatform { | ||
|
||
sequential |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With lowering the iteration counts, I think we will be able to avoid this.
|
||
sequential | ||
|
||
override def executionTimeout = 60.seconds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oooof. Can we lower iteration counts instead?
case ex: IllegalStateException | ||
if ex.getMessage == "supervisor already shutdown" => | ||
None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be a bit easier to diagnose an eventual failure here if we convert this into a side-effecting test assertion.
case ex: IllegalStateException | |
if ex.getMessage == "supervisor already shutdown" => | |
None | |
case ex: IllegalStateException => | |
ex.getMessage mustEqual "supervisor already shutdown" | |
None |
IO.sleep(100.millis) *> adaptedFiber.cancel *> adaptedFiber.join *> ( | ||
(counter.get, IO.sleep(100.millis) *> counter.get).flatMapN { | ||
case (v1, v2) => | ||
IO((v1: Int) mustEqual (v2: Int)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are the type ascriptions needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, I don't even remember why I did those 🙂
} | ||
} | ||
|
||
tsk.parReplicateA_(if (isJVM) 10000 else 5).as(ok) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to replicate it at all on non-JVM platforms?
Also, I'm assuming this is the test that takes most of the time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point; I think I copied the if (isJVM) ... else 5
part from somewhere.
For me, it was usually supervise / finalize race with checkRestart
that timed out.
|
||
private[this] val allFibers: F[List[Fiber[F, Throwable, _]]] = | ||
stateRef.get.map(_.values.toList) | ||
stateRef.getAndSet(null).map(_.values.toList) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So null
is the signal that we've decided to kill ourselves? Basically, the invariant here is that joinAll
/cancelAll
can only be called at the end of the world, when the Supervisor
is shutting down. Can we toss that in a comment so we don't forget it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is already a comment in abstract class State
about joinAll
/cancelAll
, but I'll note it here too.
Also it occurs to me that, with this, we might not need the double-check on closure state in #3923 |
FTR there was a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome!
This PR fixes a few bugs in
Supervisor
(and creates an unknown number of new ones ;-).Fixed bugs (at least the ones I remember spotting; I tried to write tests for them, couldn't for all of them):
Supervisor
strangeness #3394checkRestart
,cancel
allowed and didn't wait for one last restart.supervise
/close race (although, due to ClosedSupervisor
strangeness #3394, close wasn't effective anyway).checkRestart
, ifcancel
cancelled the working fiber before it did anything, the finalizer (which removes the fiber from the map) did not run. (This was effectively a memory leak.)