Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement reference types for .NET embeddings. #27

Closed
peterhuene opened this issue Jul 17, 2020 · 18 comments · Fixed by #29
Closed

Implement reference types for .NET embeddings. #27

peterhuene opened this issue Jul 17, 2020 · 18 comments · Fixed by #29
Assignees
Labels
enhancement New feature or request

Comments

@peterhuene
Copy link
Member

Now that the reference types proposal is implemented in Wasmtime, let's implement it for .NET embeddings!

This means fully supporting the ref values, ref value types, handling reference types in host functions, handling reference types when invoking Wasm functions, and properly rooting .NET GC heap references when passing in and out of WebAssembly.

@peterhuene peterhuene self-assigned this Jul 17, 2020
@peterhuene peterhuene added the enhancement New feature or request label Jul 17, 2020
@peterhuene
Copy link
Member Author

Once this is implemented, we can release a 0.19.0 NuGet package.

@fitzgen
Copy link
Member

fitzgen commented Jul 17, 2020

FWIW, here is what it took to do this in the python API: bytecodealliance/wasmtime-py@b8532a4

We don't really integrate with the python reference counting collector, we just manually pin data that is referenced by an externref in a global map.

@peterhuene
Copy link
Member Author

peterhuene commented Jul 17, 2020

It'll be something similar for .NET, although we won't need to do any mapping ourselves.

We'll allocate a GCHandle that will act as the extern ref's value; the finalizer will free the GCHandle, allowing for collection provided the .NET object is no longer reachable. The .NET object won't need to be pinned as the GCHandle can track it and WebAssembly never needs the physical address of the object on the .NET GC heap.

@peterhuene
Copy link
Member Author

peterhuene commented Jul 18, 2020

I've got extern refs working for .NET embeddings, but I'm noticing that the finalizer isn't being invoked:

  • Host calls wasmtime_externref_new_with_finalizer, the internal ref count should be 1 for the returned value.
  • Pass reference into Wasm function (reference not stored in a table or global, simply passed to host function).
  • Host calls wasm_val_delete on the reference value. This should be driving the reference count to 0 and thus invoking the finalizer.

I haven't debugged it yet, so I'll take a look on Monday.

@peterhuene
Copy link
Member Author

peterhuene commented Jul 20, 2020

So I'm seeing 5 increments and 4 decrements between a wasm_func_call invocation passing a ref value:

Creating extern ref with `wasmtime_externref_new_with_finalizer`.
Calling function...
Extern ref count: 1 -> 2
Extern ref count: 2 -> 3
Extern ref count: 3 -> 4
Extern ref count: 4 -> 5
Extern ref count: 5 -> 6
Extern ref count: 6 -> 5
Hello from C#, Peter!
Extern ref count: 5 -> 4
Extern ref count: 4 -> 3
Extern ref count: 3 -> 2
Wasm function returned...
Deleting previously created value with `wasm_val_delete`.
Extern ref count: 2 -> 1

The code:

using var engine = new EngineBuilder()
    .WithReferenceTypes(true)
    .Build();

using var module = Module.FromText(engine, "externref", @"
(module
  (import """" ""hello"" (func $.hello (param externref)))
  (func (export ""run"") (param externref)
    local.get 0
    call $.hello
  )
)");

using var host = new Host(engine);

host.DefineFunction(
    "",
    "hello",
    (string name) => Console.WriteLine($"Hello from C#, {name}!")
);

using dynamic instance = host.Instantiate(module);
instance.run("Peter");

And the value is thereby "leaked" (actually held on to by Wasmtime). I'm capturing the ref count stacks now.

@peterhuene
Copy link
Member Author

peterhuene commented Jul 20, 2020

increment: /home/peterhuene/src/wasmtime/crates/c-api/src/func.rs:213 (_wasmtime_func_call)
increment: /home/peterhuene/src/wasmtime/crates/wasmtime/src/func.rs:597 (Func::call)
increment: /home/peterhuene/src/wasmtime/crates/wasmtime/src/func.rs:279 (Func::new)
increment: /home/peterhuene/src/wasmtime/crates/c-api/src/func.rs:91 (create_function)
increment: /home/peterhuene/src/wasmtime/crates/c-api/src/ref.rs:129 (wasmtime_externref_data)

decrement: /home/peterhuene/src/wasmtime/crates/c-api/src/ref.rs:142 (wasmtime_externref_data)
decrement: /home/peterhuene/src/wasmtime/crates/c-api/src/func.rs:105 (create_function)
decrement: /home/peterhuene/src/wasmtime/crates/wasmtime/src/func.rs:311 (Func::new)
decrement: /home/peterhuene/src/wasmtime/crates/c-api/src/func.rs:247 (_wasmtime_func_call)

Looks like there's no matching decrement for the clone in Func::call. Still digging.

@peterhuene
Copy link
Member Author

peterhuene commented Jul 20, 2020

I see, I didn't expect the call arguments to end up in the activations table as the reference value is always rooted by the caller to wasm_func_call. This is problematic as there is no way to manually trigger the GC from the C API (afaik), not that we would want to do it after every call to wasm_func_call anyway.

@fitzgen thoughts on the above?

@peterhuene
Copy link
Member Author

peterhuene commented Jul 21, 2020

I deleted the previous comment regarding a leak on dropping the store; there was actually a reference to the store alive from the .NET side because a Dispose call was missing (DefineFunction returns a disposable object, thus the store was only dropping if a .NET GC occurred). When the store drops, the extern ref is finalized as expected.

@fitzgen
Copy link
Member

fitzgen commented Jul 21, 2020

We can add a wasmtime.h function to do GC, or you can pass in a bunch of references into Wasm in a loop until GC is triggered.

FWIW, I didn't test GC reclaimation in wasmtime-py, since we have some tests for that in Rust, but on further reflection it probably does make sense to test the interaction with host objects and finalization.

@fitzgen
Copy link
Member

fitzgen commented Jul 21, 2020

Also, should StoreInner::drop do a GC?

I ask because the extern ref is definitely leaked even if it's in the activation table when the store gets dropped by the host.

This is interesting, I would expect that it would be cleaned up via the activations table running Drops on its elements. I'll see if I can make a Rust-only reproducer for this.

Oh nevermind, just saw your update.

@fitzgen
Copy link
Member

fitzgen commented Jul 21, 2020

Here is a PR to add a C API to do GC: bytecodealliance/wasmtime#2052

@peterhuene
Copy link
Member Author

peterhuene commented Jul 21, 2020

For my understanding, why are wasm_func_call extern ref arguments put in the activation table? Given the extern refs don't end up in a table or a global in this case, it is a little surprising to me that Wasmtime itself would keep a reference alive independently of the target Wasm code.

I think exposing a GC-triggering method in the .NET API and letting the users manually collect if desired (they'll know best when to do so given their use case) make the most sense.

@fitzgen
Copy link
Member

fitzgen commented Jul 21, 2020

For my understanding, why are wasm_func_call extern ref arguments put in the activation table? Given the extern refs don't end up in a table or a global in this case, it is a little surprising to me that Wasmtime itself would keep a reference alive independently of the target Wasm code.

Yeah, this is due to the deferred reference counting for references inside Wasm frames, where we don't actually increment/decrement the ref counts. Instead we over-eagerly keep them alive until we do a GC, when we find the precise set that need to be alive, and reclaim the references that were in the over approximation. See https://github.com/bytecodealliance/wasmtime/blob/main/crates/runtime/src/externref.rs#L61-L100 for more details.

@peterhuene
Copy link
Member Author

peterhuene commented Jul 21, 2020

The GC implementation makes sense. I guess what I'm getting at is that it would be nice if there were a mechanism to determine if the arguments need to end up in the activation table for a call to wasm_func_call.

In this particular use case, there are no wasm frames on the stack when wasm_func_call is invoked. As a result, any extern ref arguments are guaranteed to be rooted by the caller as wasm_func_call doesn't take ownership of the values.

Indeed, the only way to call wasm_func_call with an extern ref that needs to be in the activation table would be a call from an imported host function accepting an extern ref. But if the host function is passed an extern ref that wasn't an argument to the "bottom most" wasm_func_call on the stack (i.e. one which has no wasm frames underneath it) then it must have come from a table or global get and thus would be in the activation table already, no?

@peterhuene
Copy link
Member Author

peterhuene commented Jul 21, 2020

My very-surface-level understanding of your GC implementation aside, it isn't a big deal to let the .NET users know that if they pass .NET GC heap objects into Wasm then they'll always be kept alive until a Wasmtime GC is triggered, either automatically or manually.

@fitzgen
Copy link
Member

fitzgen commented Jul 21, 2020

Yeah, if the reference is going to outlive the call, then we could do that optimization. This does mean that if the Wasm doesn't need the reference after some point, and the host doesn't actually need to use the reference again, then it can't be reclaimed eagerly at the next GC.

With the Func::wrap/Func::get Rust APIs, we move the references into the call, passing ownership to Wasm. This means that references need to go into the activations table, but also that we can eagerly reclaim references during GC even when the call hasn't returned.

I'm sure the trade off favors one approach or another depending on the specifics.

@fitzgen
Copy link
Member

fitzgen commented Jul 21, 2020

Filed bytecodealliance/wasmtime#2056 to investigate taking advantage of this

@peterhuene
Copy link
Member Author

Sorry, I was basically just repeating myself there for no reason, so I deleted my last comment 😅. Thanks for filing the issue!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants