-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fast effect forwarding #256
Conversation
/// contref = chain_link.get_contref() | ||
/// parent_link = contref.parent | ||
/// parent_csi = parent_link.get_common_stack_information(); | ||
/// handlers = parent_csi.handlers; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem right to me. The handlers should sit on the current continuation reference. If the current one doesn't handle the suspension, then we try the parent, and so forth.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if I understand what you are saying.
Consider the following example:
...
^
| parent
c1
^
| parent
c2
where $c1
executed (resume $ct (on $t switch) ... (local.get $c2))
and $c2
is currently executing.
In the current design in this PR, continuation $c1
would have a handler list l with a switch
entry for $t
(as well as all other tags potentially occurring in the resume
above).
Continuation $c2
is currently running and has no handler list.
Are you suggesting that the handler list should be attached to $c2
instead, meaning that each continuation stores what is handled by its immediate parent?
The disadvantage of that approach over the current one is that it requires additional work on switch
instructions:
In the example above, if $c2
then executes (switch $ct' $t (local.get $c3))
for some $c3
, we would have to update the handler lists such that l becomes the handler list of $c3
(and the handler list of $c2
is cleared for housekeeping).
This work isn't required if the handler list l is associated with $c1
instead (and stays there on switch).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are right about querying the parent first. We do that in all our implementations, I see. On reflection, it must be like that, because the handler can grow its continuation, and when resuming a captured continuation it needs to be spliced with the handler's continuation.
Regarding switch
and clearing the handlers on the continuation reference, I think the only thing that needs to happen for $c3
is that it has its parent adjusted (similar for the current continuation). We should never have to update the handlers of "switched" continuations, I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding
switch
and clearing the handlers on the continuation reference, I think the only thing that needs to happen for$c3
is that it has its parent adjusted (similar for the current continuation). We should never have to update the handlers of "switched" continuations, I think.
Yes, this is what the design in this PR ensures. You don't have to touch the handlers on switch
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The semantics of switch
is similar to the semantics of a deep handler, in the sense that the handler stays in place after resume. So any proper design should not have to modify the handler on switch
.
let raw_parts = builder.block_params(handle_link); | ||
let chain_link = | ||
tc::StackChain::from_raw_parts([raw_parts[0], raw_parts[1]], env.pointer_type()); | ||
let is_main_stack = chain_link.is_main_stack(env, builder); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really don't like this special case of the main stack. But we can fix that in another patch. Ideally, every stack is homogeneous.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no "special case for the main stack" here. We are traversing a linked list, whose last element is the main stack.
The only thing that is_main_stack
is used for is detecting that we have reached the end of the list without finding a handler.
You could rename is_main_stack
to has_no_parent
. Such a check will always be required in this traversal.
The rest of the code treats continuations and the main stack is completely homogeneously.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My comment was conceptual. It is not about this particular code. We shouldn't have a notion of a main stack at all, is my point.
|
||
// Note that the control context we use for switching is not the one in | ||
// (the stack of) resume_contref, but in (the stack of) last_ancestor! | ||
let fiber_stack = last_ancestor.get_fiber_stack(env, builder); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand the point of this "last ancestor" business. Please explain. See my other comment below.
/// | +----------------+ | ||
/// | ^ | ||
/// | | parent | ||
/// last ancestor | | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, so reading this drawing I may understand what you are on about with the ancestor business. What you call last_ancestor
is really the initial frame of the delimited continuation that is being captured. I think we should name all this differently. The frame is not the "last" ancestor, as it may be adopted by another ancestor etc. However, it is the tail of the linked list that defines the continuation. Thus, I think it is better to use names such as "tail"/"head" to describe these objects. The data structure you are building is known as a "circular list". I think it is worthwhile to adopt this terminology to make the implementation easier to understand and more readily accessible to other people.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not particularly married to the name last_ancestor
. However, this field is only ever set on the "head" continuation (bottom-most continuation in the picture) and only if it is suspended. Thus, if its set, it indeed points to the last ancestor.
I'm not particularly keen on the term tail
, since it is somewhat overloaded in the context of linked lists. It is both used to refer to the next node in a linked list, but also to refer to the end of the list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am happy to consider anything but last ancestor. Am I correct that the continuation object points to the bottom stack in the chain (i.e. the stack that executed the suspension)? If so, then I think we can optimise it ever-so-slightly but returning the "last ancestor" instead and set its parent to be the bottom stack. We should save a field that way too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's possible, but I would prefer holding off from that. The reason is that it complicates how at suspend
, we propagate the required information to the resume
site.
In the current version of this PR, the following happens:
- We identify the currently running continuation
c1
by reading the stack chain field of theVMContext
. This will be the bottom of the continuation chain of the continuation object we are about to construct. - The handler search gives us the other end of the chain,
c2
. We setc1.last_ancestor
toc2
. - We perform the stack switch.
- Back in
resume
, we readc1
from theVMContext
(it's still saved in the stack chain field) and obtain the other end (c2
) from itslast_ancestor
field.
In the design you propose, we would instead make c2
point to c1
in step 2. But this means that in the 4th step above, we have no way to obtain c2
. We would have to either temporarily write c2
into the VMContext
or extend the switch
instruction so that we can pass more payloads. I'd very much prefer avoiding that, at least for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ugh, it strikes again. I understand. We really need to get rid of the auxiliary structures. We should be able to encode everything in one chain. Certainly we need this before/for the upstreaming.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue I described is orthogonal from the details of how we represent the chain of currently active continuations. In any design, there will always be a field in the VMContext
that points to the currently active continuation. That's the field accessed in step 1 and 4 above.
In that sense, the design in this PR works without requiring any new channels (in the widest sense) to get information from the suspend
to the resume
site.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not convinced. To me the current design/implementation has a certain spaghetti feeling to it. But it can be refactored later. Clarification: I am not talking about this patch, but our entire implementation. It is too convoluted in my opinion. It is an obvious consequence of how we have organically evolved the implementation too. I am positive a refactoring will reveal a much simpler design/implementation.
All this list business, reminds me of some tricks we should try. One trick used to speed up linked list searching is the "move to front" trick, where once an element has been found it is moved to the front of the list, such that next time you look for it, it may be one of the first elements. Obviously, we don't want exactly that, but a variant of it may speed up continuation materialisation even further, naming, what we could have on each continuation reference is a pointer to the most recent suspension handler (like caching the latest handler). |
I guess we can merge this as-is. |
I'm fine with that, but would you still prefer renaming the |
I don't think it is necessary now. I was just commenting that I think we should come up with a better name for when we upstream it. |
This PR provides the missing piece to support `switch` instructions, by adding the necessary codegen (but only for the optimized implementation, support in the baseline implementation is not included). Thus, the main change is the addition of `wasmtime_cranelift::wasmfx::optimized::translate_switch`. In general, the implementation piggybacks on #256: On `resume`, we now fill the `HandlerList` with _m_ entries for all tags with suspend handlers, followed by _n_ entries for all tags with switch handlers. The `search_handler` code is changed so that on `suspend` and `switch`, we look in the correct part of the `HandlerList`. We could use two separate `HandlerLists` instead, but then we would have yet another allocation to manage, so putting both kinds of tags into the same list, and then only searching part of it seems preferable. A few more notes: - `translate_resume` and `translate_switch` now return a `WasmResult`. This is used so that the baseline implementation can bail out without needing to `panic`. - The test runner in `stack_switchting.rs` now takes an extra parameter that allows us to enable the gc proposal, which is required for the tests using recursive types. --------- Co-authored-by: Daniel Hillerström <daniel.hillerstrom@ed.ac.uk>
This PR fundamentally changes how we handle tag/effect forwarding: Currently, at any
suspend
site, we unconditionally attempt to switch to the parent stack (or fail, if we are already on the the main stack). In the process, we pass the tag identifier that we suspended with. Then, we check if theresume
clause in the parent had a handler for that tag. If it doesn't, we attempt to suspend again to the parent of the (original) parent. This process is repeated until we either find a handler, or reach the main stack (meaning that there was no appropriate handler).The new design introduced by this PR works as follows: Every stack (main stack or continuation) carries a
HandlerList
around. This is effectively a vector of tag identifiers (i.e.,*mut VMTagDefinition
).On
resume $ct (on $tag_1 $block_1) ... (on $tag_n $block_n)
, we then fill the active stack's HandlerList with the following n entries:On
suspend $tag
, we then traverse the chain of active continuations, inspecting theHandlerList
of the parents of the currently active continuation. If we find the address of$tag
as the _i_th entry of the handler list of someresume
instruction in our parents, we directly switch to the stack that executed the corresponding resume instruction. In the process, we pass along the list index i. Back at theresume
site, we can then directly jump to the _i_th handler block, using a jump table.If the handler search does not succeed, we immediately trap at the
suspend
site.Some implementation notes:
wasmtime_cranelift::wasmfx::optimized::search_handler
, which is used for emitting code atsuspend
sites.translate_resume
, it is actually simpler now! The rather annoying back edge from the forward block to the resume block is gone, and every block will at most be executed once.ControlEffect
is no longer usingTaggedPointer
. There are simply no more pointers involved here at all, the only payload to hand from suspend site to resume site (in addition to the return vs suspend signal) is the index of the handler we want to use.In general, everything is built to be compatible with
switch
being added on top of this design: In particular, theHandlerList
and aforementionedsearch_handler
can easily be adapted to supportswitch
, and the changes totranslate_resume
are made withswitch
in mind.