Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make wasm.HostFunctionCallContext an interface #204

Closed
codefromthecrypt opened this issue Feb 5, 2022 · 11 comments · Fixed by #246
Closed

Make wasm.HostFunctionCallContext an interface #204

codefromthecrypt opened this issue Feb 5, 2022 · 11 comments · Fixed by #246

Comments

@codefromthecrypt
Copy link
Contributor

Right now, someone can accidentally mess up the memory of the host function, I think both the context and its members should be accessed via interfaces, which can provide functions to manipulate memory instead of direct buffer access. This will ensure bounds etc aren't violated.

Particularly this means not exporting Memory.Buffer as Memory is becomes an interface and it has interface functions to perform common tasks.

@mathetake
Copy link
Member

+1

@mathetake
Copy link
Member

One thing we have to care about is that the cost of accessing via interface vs just unexporting memory instance field

@mathetake
Copy link
Member

As each host call is equivalent to making system calls so people want them as fast as possible!

@codefromthecrypt
Copy link
Contributor Author

codefromthecrypt commented Feb 5, 2022

I guess in any case there will be an unexported field, so what you are saying is maybe we leave it as a struct with exported functions to manipulate mem vs define those same functions in an interface?

ex.

	MemoryInstance struct {
		buffer []byte // inexpert only this
		Min    uint32 // PS especially below would make sense to also not export as export makes it mutable
		Max    *uint32
	}

func (m *MemoryInstance) PutUint32(addr uint32, val uint32) bool {
--snip--

vs defining and matching an interface

type MemoryInstance interface {
  PutUint32(addr uint32, val uint32) bool
  Min() uint32
  Max() *uint32
}

	memoryInstance struct {
--
	}

func (m *memoryInstance) PutUint32(addr uint32, val uint32) bool { // matches 
--snip--

That we expect the type dispatch to drop performance in a way we should care about. Possible! and luckily benchmark able. Meanwhile our own code can avoid that type dispatch as we can also choose to export it, but put it in an internal package. So the main thing is if user-code using an interface is slow enough to never do that, not here or anywhere else.

Do I get it right?

@codefromthecrypt
Copy link
Contributor Author

I guess one point I mean to say is that this is about the user perspective. Anything we do for speed internally is possible thankfully for go's internal special package. Since we won't allow alternate implementations to be returned, we can guarantee what type is used and make sure that internally it is always specific, and so our jit can use that type and not use the interface

@codefromthecrypt
Copy link
Contributor Author

Looking at #205 sounds helpful to prefer structs over interfaces internally when we are doing a lot of dispatch, as it can save 1.5ns due to the extra pointer indirection.

OTOH for what we refer to as the public API, I think interfaces are better as it prevents field mutability and generally reduces the surface area of bugs such as we've gotten lately.

So, basically I suggest we:

  • move scary mutable memory types to an internal package
  • create a replacement interface wasm.HostFunctionCallContext and add a compile-check that internal types implement it
  • add rationale for why we use internal types directly even if we have an interface (saving the 1.5ns/pointer indirection)

Ex.

type HostFunctionCallContext  interface { // implemented by internal/host/HostFunctionCallContext or something
  Context()
  Mem()
}

If there are cyclic dep issues, we'd work out the kinks in a PR. SG?

@nullpo-head
Copy link
Contributor

nullpo-head commented Feb 7, 2022

The suggestion of separating the internal package and the interface sounds good to me. Just to confirm, Mem() returns the MemoryInstance interface, which has bound-checking manipulation APIs such as PutUint32, right?

@codefromthecrypt
Copy link
Contributor Author

the problem is that this api is not only used by code we review. It is an external api. You won't be tagged on PRs that implement host functions, so setting up a delicate way for things to sometimes work if we look carefully.. that only works when we are looking

@codefromthecrypt
Copy link
Contributor Author

@nullpo-head yeah basically the object graph starting at HostFunctionCallContext contains other interfaces or immutable references to primitives or slices. mutation operations internally check all the things. probably I will raise a PR soon as talk is cheap and I talk too much

@codefromthecrypt
Copy link
Contributor Author

WIP: but we also need to be able to read back also, particularly for filesystem APIs. Will update in the next day or two covering all known use cases

type Memory interface {
	// PutUint32 writes a uint32 value to the underlying buffer at the offset or returns false if out of range.
	PutUint32(offset uint32, val uint32) bool

	// PutUint64 writes a uint64 value to the underlying buffer at the offset or returns false if out of range.
	PutUint64(offset uint32, val uint64) bool

	// PutBytes writes the slice to the underlying buffer at the offset or returns false if out of range.
	PutBytes(offset uint32, val []byte) bool
}

codefromthecrypt pushed a commit that referenced this issue Feb 15, 2022
This adds two interfaces with only one implementation each, used to
decouple end-user code from internals.

* api.HostFunctionCallContext - allows access to go Context and Memory
* api.Memory - read/write standard types and arbitrary data

This currently ports existing code to use the interfaces. It is true
that over time we can use structs instead of interfaces internally to
save a pointer reference in WASI call sites. However, I'm choosing not
to do this right now, as it is more important to see the interfaces
work. We should defer optimizing towards structs until after we
repackage implementations under the "internal" directory.

Fixes #204

Signed-off-by: Adrian Cole <adrian@tetrate.io>
@codefromthecrypt
Copy link
Contributor Author

#246 will fix this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants