Refactor CallbackStack based on new "pack locking" strategy #3943
Conversation
```scala
def apply(a: A): Boolean = {
  // TODO should we read allowedToPack for memory effect?
```
As explained above, it's not strictly necessary, but if we did read `allowedToPack` then at least some of the packing would be guaranteed to be visible. But I'm not sure if this additional synchronization is worth it: the modifications may already be visible fortuitously, and we still need to be prepared for concurrent packing anyway.
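For context, a minimal sketch (not the actual implementation) of what the optional read could look like. It assumes the class has a `head: AtomicReference[Node]` field, a `Node` with plain `callback`/`next` vars, and the `allowedToPack: AtomicBoolean` lock described in this PR:

```scala
def apply(a: A): Boolean = {
  // Optional volatile read, purely for its memory effect: observing the release
  // write of the last completed pack() establishes happens-before, so at least
  // some of its unlinking/clearing becomes visible before we start iterating.
  // allowedToPack.get()

  var invoked = false
  var node = head.get()
  while (node ne null) {
    val cb = node.callback
    if (cb ne null) {
      cb(a) // each invoked callback is a cont's resume, i.e. 1-2 CASes
      invoked = true
    }
    node = node.next
  }
  invoked
}
```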
Good point, I didn't consider concurrent `pack` and `apply`. If there were any concerns regarding correctness (it boils down to: is it ok for `pack` to rewrite the edges pointing to already published data without synchronization?), we could even consider spinning on the `allowedToPack` lock and owning the list during `apply`? That might even amortize, because `apply` won't take the packed "detour"?
> could even consider spinning on the `allowedToPack` lock and own the list during `apply`

Sam and I considered this idea but decided against it. We could be spinning for an unbounded time while we wait for `pack` to finish iterating over a very long list. Or what if we lose the race and another `pack` acquires the lock? For sure, these are worst-case pathologies, but since owning the lock is not required to safely iterate the list, spinning until we acquire it doesn't seem worth the risk.
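For illustration only, a sketch of the rejected alternative (spin until the lock is held, then iterate). The `true`-means-unlocked convention and the `invokeAll` helper are assumptions made for this sketch:

```scala
// Rejected alternative: own the list during apply by spinning on the pack lock.
// The spin is unbounded: pack() may be walking a very long list, or another
// pack() may grab the lock again right after it is released.
def applyOwningTheList(a: A): Boolean = {
  while (!allowedToPack.compareAndSet(true, false)) {
    Thread.onSpinWait() // busy-wait until the lock is released
  }
  try invokeAll(a) // hypothetical helper: iterate the (now stable) list
  finally allowedToPack.set(true)
}
```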
I'd say yes, read `allowedToPack` here, because: what will we do here? We'll call all the `cb`s in the list, right? Which are the conts' `resume`s, so when we call them, each will do 1-2 CASes. If we can avoid calling some of them (the cleared ones) by doing a volatile read, that seems like a win. (Of course, we might already not call them, because we might already see that they're null... so I'm not sure actually...)
That's a really interesting point :) it's essentially gambling:
- we do a volatile read now, but get a pretty good guarantee that we won't do unnecessary CASes
- we take our chances, and if we are lucky we get away with doing the minimum synchronization necessary
The cleared ones are nulled out though, so we shouldn't have any penalty there besides the node read, or am I missing something?
> The cleared ones are nulled out though

Yes, but currently there is no guarantee that this is published either. They are ordinary writes/reads.
```scala
/**
 * Nulls all references in this callback stack.
 */
def clear(): Unit = {
  callback = null
  head.lazySet(null)
}
```
I wonder if we need to be more aggressive about iterating the list and `null`ing all the callbacks and `next`s. The concern is that if someone holds onto a `Handle` for cancelation, they could actually be keeping the entire structure in memory.
I think that's a valid concern. Maybe we could do that during `apply`? It's a little strange, because `apply` and `clear` are separate, but a call to `clear` always follows `apply`, doesn't it?
> It's a little strange, because `apply` and `clear` are separate, but a call to `clear` always follows `apply`, doesn't it?

I agree. I'd be happy to see `clear` go away and we do the clearing as part of `apply`.
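A rough sketch of what folding the clearing into `apply` could look like, so a retained `Handle` pins only its own node rather than the whole list. The plain `callback`/`next` vars are assumptions, and this glosses over the interaction with a concurrent `pack`, which the real code would have to consider:

```scala
def applyAndClear(a: A): Boolean = {
  var invoked = false
  var node = head.get()
  while (node ne null) {
    val cb = node.callback
    if (cb ne null) {
      cb(a)
      invoked = true
    }
    val next = node.next
    // Aggressively drop references so that a Handle kept around for
    // cancelation cannot retain the rest of the structure.
    node.callback = null
    node.next = null
    node = next
  }
  head.lazySet(null)
  invoked
}
```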
Having a separate `clear` seems important here. This code surprised me:
cats-effect/core/shared/src/main/scala/cats/effect/IOFiber.scala, lines 1075 to 1084 in 0207637:
```scala
try {
  if (!callbacks(oc, false) && runtime.config.reportUnhandledFiberErrors) {
    oc match {
      case Outcome.Errored(e) => currentCtx.reportFailure(e)
      case _ => ()
    }
  }
} finally {
  callbacks.clear() /* avoid leaks */
}
```
I think the only callback that can throw is one installed at construction, e.g. via `IO#unsafeRunAsync`. But it seems better to `try`/`catch` where the callback is defined, and only install non-throwing callbacks in `IOFiber`. It especially strikes me as odd if the other callbacks are not evaluated because the first one threw.

However, these changes seem involved enough to warrant their own PR ...
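A small, self-contained sketch of that suggestion: wrap a potentially throwing callback where it is defined (e.g. at the `unsafeRunAsync` boundary) so that only non-throwing callbacks ever reach the fiber. `guarded` and the `reportFailure` parameter are hypothetical names, not existing API:

```scala
import scala.util.control.NonFatal

// Wrap a user-supplied callback at its definition site so the fiber can assume
// callbacks never throw, and one throwing callback cannot prevent the others
// from being invoked.
def guarded[A](cb: Either[Throwable, A] => Unit)(
    reportFailure: Throwable => Unit): Either[Throwable, A] => Unit =
  result =>
    try cb(result)
    catch { case NonFatal(t) => reportFailure(t) }
```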
Looks great! Just a couple of thoughts:
Love this. Some trivial changes and a couple simple questions but otherwise this is great.
Looks like a perfect wash (within the margin of error) in terms of performance. Good to go on that front!

[Before/after benchmark results omitted.]
I'm still leaning toward rolling the dice on not fencing in `apply`. Let's see how that treats us though, and if we're seeing problems there, we'll add the fence.
Fixes #3940. Very fun collaboration with @samspills, but credit to @mernst-github for proposing the idea of guarding `pack` in #3940 (comment).

The new callback stack is made up of two data types:
- `CallbackStack.Node`, a singly-linked list of callbacks
- `CallbackStack` itself, which holds:
  - an `AtomicReference` to the head `Node`, and
  - an `AtomicBoolean` lock `allowedToPack`, used for guarding `pack`

Pushing to the stack is a straightforward CAS of the head.

`pack` is the only method that modifies the list:
- the `AtomicBoolean` lock guarantees that only one `pack` operation is running at a time, and is also used to publish the modifications to subsequent packs
- if `pack` is unable to acquire the lock, it gives up immediately

Invoking the callbacks is not guarded by the lock. This means that the modifications by `pack` may not be visible, and in fact `pack`ing can even occur concurrently while iterating over the list. This is actually okay:
- we will never miss any `Node`s, since additions happen by CASing the pointer to the head `Node`
- if we follow a pointer to a `Node` that was actually removed, we take a "detour" but do eventually visit all the `Node`s in the list
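To make the description above concrete, here is a condensed, self-contained sketch of the shape it describes. Only `CallbackStack`, `Node`, `pack`, and `allowedToPack` are names from the PR; everything else, including the sense of the boolean and the elided helper, is an assumption for illustration:

```scala
import java.util.concurrent.atomic.{AtomicBoolean, AtomicReference}

final class CallbackStack[A] {
  // Singly-linked list of callbacks; cleared slots are nulled-out callbacks.
  final class Node(var callback: A => Unit, var next: Node)

  private val head = new AtomicReference[Node](null)
  private val allowedToPack = new AtomicBoolean(true) // true = lock is free

  // Pushing is a straightforward CAS of the head.
  def push(cb: A => Unit): Node = {
    val node = new Node(cb, head.get())
    while (!head.compareAndSet(node.next, node)) {
      node.next = head.get()
    }
    node
  }

  // pack is the only method that modifies the list. The AtomicBoolean acts as a
  // try-lock: at most one pack runs at a time, and releasing it publishes the
  // modifications to the next pack. If the lock is taken, give up immediately.
  def pack(): Unit =
    if (allowedToPack.compareAndSet(true, false)) {
      try removeClearedNodes() // hypothetical helper: unlink nodes with null callbacks
      finally allowedToPack.set(true)
    }

  private def removeClearedNodes(): Unit = () // elided in this sketch
}
```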