add wasm32-wasip2 support #1836

Open
wants to merge 2 commits into
base: master

dicej commented Oct 9, 2024

This implementation currently uses a mix of POSIX-style APIs (provided by wasi-libc via the libc crate) and WASIp2-native APIs (provided by the wasi crate).

Alternatively, we could implement Selector using only POSIX APIs, e.g. poll(2). However, that would add an extra layer of abstraction to support and debug, as well as make it impossible to support polling wasi:io/poll/pollable objects which cannot be represented as POSIX file descriptors (e.g. timer events, DNS queries, HTTP requests, etc.).

Another approach would be to use only the WASIp2 APIs and bypass wasi-libc entirely. However, that would break interoperability with both Rust std and e.g. C libraries which expect to work with file descriptors.

Since wasi-libc does not yet provide a public API for converting between file descriptors and WASIp2 resource handles, we currently use a non-public API (see the netc module below) to do so. Once WebAssembly/wasi-libc#542 is addressed, we'll be able to switch to a public API.

I've tested this end-to-end using https://github.com/dicej/wasi-sockets-tests, which includes smoke tests for mio, tokio, tokio-postgres, etc.

Signed-off-by: Joel Dice <joel.dice@fermyon.com>
dicej added a commit to dicej/tokio that referenced this pull request Oct 9, 2024
This adds support for the new `wasm32-wasip2` target platform, which includes
more extensive support for sockets than `wasm32-wasip1` (formerly known as
`wasm32-wasi`).

The bulk of the changes are in tokio-rs/mio#1836.  This
patch just tweaks a few `cfg` directives to indicate `wasm32-wasip2`'s
additional capabilities.

In the future, we could consider adding support for `ToSocketAddrs`.  WASIp2
natively supports asynchronous DNS lookups and is single threaded, whereas Tokio
currently assumes DNS lookups are blocking and require multithreading to emulate
async lookups.  A WASIp2-specific implementation could do the lookup directly
without multithreading.

I've tested this end-to-end using https://github.com/dicej/wasi-sockets-tests,
which includes smoke tests for `mio`, `tokio`, `tokio-postgres`, etc.  I'd also
be happy to add tests to this repo if appropriate; it would require adding a
dev-dependency on e.g. `wasmtime` to actually run the test cases.

Signed-off-by: Joel Dice <joel.dice@fermyon.com>

dicej commented Oct 9, 2024

I just realized this PR doesn't include support for accepting incoming connections -- only initiating outgoing ones. I forgot about the former since I don't personally have an urgent need to support it, but it could be added as a follow-up PR. To be clear: WASIp2 is fully capable of handling that case.

Comment on lines +242 to +247
let mut push_event = || {
    events.push(Event {
        token: subscription.token,
        interests: *interests,
    })
};

Just out of curiosity, is there any reason to do that in a closure vs. doing it inline at L276, L289 and L297?

Author

No particular reason other than avoiding the repetition.

Alright! Was not sure if it was DRY or there was a mechanism I am not aware of, thanks for clarification 🙏

raskyld commented Oct 9, 2024

Also, I have a last clarifying question: reading wasi-libc I assumed that all Pollable are abstracted as a Socket but this socket is a "fake" one in the sense that we only use it to convey the "readiness" of the underlying Pollable irrespective of whether it is actually obtained through wasi:sockets.

But your comment:

> I just realized this PR doesn't include support for accepting incoming connections -- only initiating outgoing ones. I forgot about the former since I don't personally have an urgent need to support it, but it could be added as a follow-up PR. To be clear: WASIp2 is fully capable of handling that case.

actually speaks of connections, so I am not sure my previous assumption was correct.

Anyway, great job 💪

dicej commented Oct 9, 2024

> Also, I have a last clarifying question: reading wasi-libc I assumed that all Pollable are abstracted as a Socket but this socket is a "fake" one in the sense that we only use it to convey the "readiness" of the underlying Pollable irrespective of whether it is actually obtained through wasi:sockets.

WASIp2 represents things like TCP and UDP sockets as resources (e.g. tcp-socket) which have methods for binding, listening, connecting, etc. Once a socket is connected, you can get its input-stream and output-stream, which are also resources. And each of those can be used to obtain a pollable representing read and write readiness, respectively.

Consequently, wasi-libc must keep track of up to six resource handles for each socket:

  • a tcp-socket or udp-socket, depending on the socket type
  • a pollable representing readiness of any in-progress bind, listen, connect, or accept operation
  • (if connected) an input-stream, an output-stream, and one pollable each for reading and writing

So you can think of a wasi-libc socket file descriptor as uniquely identifying a bundle of resource handles, the number and types of which depend on the state that socket is in.

Adding support for binding, listening and accepting WASIp2 sockets in mio would amount to adding match cases for e.g. tcp_socket_state_tag_t::TCP_SOCKET_STATE_UNBOUND, tcp_socket_state_tag_t::TCP_SOCKET_STATE_BOUND, and tcp_socket_state_tag_t::TCP_SOCKET_STATE_LISTENING and using the socket_pollable field of tcp_socket_t to await transitions to the next state.
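
To make the "bundle of resource handles" idea concrete, here is a minimal Rust sketch of that bookkeeping. Everything in it is a hypothetical stand-in: the handles are bare integers rather than the `wasi` crate's resource types, the variant names only loosely follow the `TCP_SOCKET_STATE_*` tags, and the mapping of connect/accept readiness onto writable/readable interest is an assumption borrowed from POSIX conventions, not taken from the PR.

```rust
/// Hypothetical stand-in for a WASIp2 resource handle.
type Handle = u32;

/// Handles that only exist once a TCP socket is connected.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
struct Connected {
    input_stream: Handle,
    output_stream: Handle,
    read_pollable: Handle,
    write_pollable: Handle,
}

/// Simplified model of the per-socket state (names are illustrative).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum TcpState {
    Unbound,
    Bound,
    Connecting,
    Listening,
    Connected(Connected),
}

/// One file descriptor's worth of state: up to six handles in total.
#[allow(dead_code)]
struct TcpSocket {
    socket: Handle,
    /// Readiness of any in-progress bind, listen, connect, or accept.
    socket_pollable: Handle,
    state: TcpState,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Interest {
    Readable,
    Writable,
}

/// Picks the pollable a selector would wait on for `interest`, if any.
fn pollable_for(sock: &TcpSocket, interest: Interest) -> Option<Handle> {
    match (&sock.state, interest) {
        (TcpState::Connected(c), Interest::Readable) => Some(c.read_pollable),
        (TcpState::Connected(c), Interest::Writable) => Some(c.write_pollable),
        // Assumption: a finished connect is reported as writable.
        (TcpState::Connecting, Interest::Writable) => Some(sock.socket_pollable),
        // Assumption: a pending incoming connection is reported as readable.
        (TcpState::Listening, Interest::Readable) => Some(sock.socket_pollable),
        // Nothing to wait on in the remaining combinations.
        _ => None,
    }
}
```

In this model, supporting another socket state means adding another `match` arm that returns `socket_pollable`, which is the shape of the follow-up work described above.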

Does that help? Happy to go into more detail if desired. Also, the wasi-sockets docs are quite thorough if you haven't perused them yet.

dicej commented Oct 9, 2024

BTW, this PR doesn't include UDP support either, again because that hasn't been a priority for me. Shouldn't be hard to add as a follow-up PR.

raskyld commented Oct 9, 2024

Thanks a lot for the time you took answering me 🙏
Your answer is super clear!

My question was more about how pollables obtained from types unrelated to sockets can be used in a mio context. For example, the streams you mentioned can also be obtained through the wasi:http world: https://github.com/WebAssembly/wasi-http/blob/main/wit/types.wit#L510

In this case, I suspect you still have the streams and the pollables but not the socket resource handles.
Could you still use those resource handles to poll them in mio? IIUC, the philosophy of the crate is to build upon any event source, independently of the underlying platform primitive, but I may misunderstand it!

dicej commented Oct 9, 2024

> My question was more about how pollables obtained from types unrelated to sockets can be used in a mio context. For example, the streams you mentioned can also be obtained through the wasi:http world: https://github.com/WebAssembly/wasi-http/blob/main/wit/types.wit#L510

Oh right, great question. Yeah, I think we'd need to add a new, WASIp2-only API for registering pollables that have no corresponding file descriptors (and likewise for tokio, presumably). This PR clearly doesn't include such a thing, but it's something we could add in another PR.

Another approach would be to add an API to wasi-libc that accepts an arbitrary pollable and allocates a file descriptor for it.
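
As a rough illustration of the first option, a selector could key its registrations on either kind of source. The sketch below is purely hypothetical: `Source`, `Registry`, and the integer handle types are invented for this example and do not correspond to any existing mio, tokio, or wasi-libc API.

```rust
use std::collections::HashMap;

/// Hypothetical: the two kinds of event source a WASIp2 selector might track.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum Source {
    /// A wasi-libc file descriptor (what the PR handles today).
    Fd(i32),
    /// A bare pollable with no file descriptor, e.g. one from wasi:http.
    Pollable(u32),
}

/// Hypothetical registry mapping each source to its token value.
#[derive(Default)]
struct Registry {
    tokens: HashMap<Source, usize>,
}

impl Registry {
    /// Registers `source`, refusing duplicates.
    fn register(&mut self, source: Source, token: usize) -> Result<(), String> {
        if self.tokens.contains_key(&source) {
            return Err(format!("{:?} is already registered", source));
        }
        self.tokens.insert(source, token);
        Ok(())
    }

    /// Removes `source`, returning its token if it was registered.
    fn deregister(&mut self, source: Source) -> Option<usize> {
        self.tokens.remove(&source)
    }

    /// Looks up the token for `source`.
    fn token_for(&self, source: Source) -> Option<usize> {
        self.tokens.get(&source).copied()
    }
}
```

The second option (having wasi-libc allocate a file descriptor for an arbitrary pollable) would make the `Pollable` variant unnecessary, at the cost of defining what fd-based calls mean for such descriptors.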

@badeend and @sunfishcode might have thoughts about this.

raskyld commented Oct 9, 2024

Alright, thanks for the answer! Reading WebAssembly/wasi-libc#542, it seems to me that allocating an fd for a pollable was among the initial options. I wonder, though, how you would implement things like fstat for those fds 🤔 but let's not pollute your MR!

To get back to the main subject:
I just remembered that I wondered why you use atomic types and a mutex in the PR, since wasip2 is single-threaded.
Is it because of the threads proposal?

dicej commented Oct 9, 2024

> To get back to the main subject: I just remembered that I wondered why you use atomic types and a mutex in the PR, since wasip2 is single-threaded. Is it because of the threads proposal?

Honestly, I just copied that from the existing WASIp1 implementation. I wrote this code almost a year ago and only came back to it yesterday, so it's not super fresh in my mind, but that part at least came straight from the existing code. I'm guessing the original code either needed it to make the compiler happy (e.g. make Selector Send and Sync) or for future-proofing. Arc and Mutex are roughly equivalent to Rc and RefCell on single-threaded Wasm, anyway, so there shouldn't be a performance penalty.
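
The "make the compiler happy" guess can be checked with a small sketch. The two selector types below are hypothetical stand-ins, not the PR's actual `Selector`; the point is only that the `Arc<Mutex<..>>` flavour satisfies `Send + Sync` bounds while the `Rc<RefCell<..>>` flavour does not.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::rc::Rc;
use std::sync::{Arc, Mutex};

/// Hypothetical selector state: tokens keyed by file descriptor.
type Subscriptions = HashMap<i32, usize>;

/// Thread-safe flavour, in the spirit of what the PR does.
#[derive(Clone, Default)]
struct SharedSelector {
    subscriptions: Arc<Mutex<Subscriptions>>,
}

/// Single-threaded flavour; neither `Send` nor `Sync`.
#[derive(Clone, Default)]
struct LocalSelector {
    subscriptions: Rc<RefCell<Subscriptions>>,
}

impl SharedSelector {
    fn register(&self, fd: i32, token: usize) {
        self.subscriptions.lock().unwrap().insert(fd, token);
    }
    fn token(&self, fd: i32) -> Option<usize> {
        self.subscriptions.lock().unwrap().get(&fd).copied()
    }
}

impl LocalSelector {
    fn register(&self, fd: i32, token: usize) {
        self.subscriptions.borrow_mut().insert(fd, token);
    }
    fn token(&self, fd: i32) -> Option<usize> {
        self.subscriptions.borrow().get(&fd).copied()
    }
}

/// Compiles only when `T` can be sent to and shared between threads.
fn assert_send_sync<T: Send + Sync>() {}

fn shared_selector_is_send_sync() {
    assert_send_sync::<SharedSelector>();
    // Uncommenting the next line fails to compile, because `Rc<RefCell<..>>`
    // implements neither `Send` nor `Sync`:
    // assert_send_sync::<LocalSelector>();
}
```

Both flavours behave the same from a single thread, so any public API that requires `Send + Sync` would push an implementation toward the first one regardless of the target's threading model.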

Comment on lines +68 to +72
[target.'cfg(all(target_os = "wasi", not(target_env = "p2")))'.dependencies]
wasi = "0.11.0"

[target.'cfg(all(target_os = "wasi", target_env = "p2"))'.dependencies]
wasi = "0.13.3"

Contributor

Is this really the correct way to go about this? Only 0.11 supports p1?

Looking at https://docs.rs/wasi/latest/wasi/
I would say yes, the last mention I see of p1 is on 0.11

@@ -175,7 +175,7 @@ where
}
}

-#[cfg(target_os = "wasi")]
+#[cfg(all(target_os = "wasi", not(target_env = "p2")))]

Contributor

Can we do this?

Suggested change
-#[cfg(all(target_os = "wasi", not(target_env = "p2")))]
+#[cfg(all(target_os = "wasi", target_env = "p1"))]

Author

The reason I did it that way is because not(target_env = "p2") is backwards compatible with rustc versions prior to the introduction of the wasm32-wasip2 target (i.e. all stable rustc versions as of this writing). See here for further discussion. It does mean we'll need to make changes when p3 becomes available, though. Happy to do whatever you think is best here.
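
A minimal sketch of the gating pattern being discussed, using a hypothetical `backend` function that is not part of the PR. The design choice is that the WASIp1 arm is expressed as "wasi, but not p2", so it still matches on toolchains that do not report `target_env = "p1"` for the older WASI target (an assumption about those toolchains, following the rationale above).

```rust
/// Hypothetical helper reporting which selector backend was compiled in.
#[cfg(all(target_os = "wasi", target_env = "p2"))]
pub fn backend() -> &'static str {
    "wasip2"
}

/// Any WASI target that is not p2 falls through to the WASIp1 code path.
#[cfg(all(target_os = "wasi", not(target_env = "p2")))]
pub fn backend() -> &'static str {
    "wasip1"
}

/// Every non-WASI target.
#[cfg(not(target_os = "wasi"))]
pub fn backend() -> &'static str {
    "native"
}
```

Exactly one of the three definitions is compiled for any given target. As noted above, a future p3 environment would also land in the middle arm unless the conditions are revisited.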

@@ -87,10 +87,10 @@ impl TcpStream {
/// entries in the routing cache.
///
/// [write interest]: Interest::WRITABLE
-#[cfg(not(target_os = "wasi"))]
+#[cfg(any(not(target_os = "wasi"), target_env = "p2"))]

Contributor

Can you use not(all(...)) here?

Author

dicej commented Oct 10, 2024

If you mean not(all(target_os = "wasi", target_env = "p1")) then the same rationale I gave above for not using target_env = "p1" applies. Again, though, happy to change it if we're not concerned about backwards rustc compatibility.

raskyld commented Oct 10, 2024

After a good night of sleep, I realised the name of the PR should probably mention that it only adds support for established wasi:sockets so it's clear that we will need follow-up PRs for other pollables and for initiating connections.

WDYT?

dicej commented Oct 10, 2024

> After a good night of sleep, I realised the name of the PR should probably mention that it only adds support for established wasi:sockets so it's clear that we will need follow-up PRs for other pollables and for initiating connections.
>
> WDYT?

Makes sense; I'll add some TODO comments to the code and to the commit message.

// TODO #1: Add a public, WASIp2-only API for registering
// `wasi::io::poll::Pollable`s directly (i.e. those which do not correspond to
// any `wasi-libc` file descriptor, such as `wasi:http` requests).
//
// TODO #2: Add support for binding, listening, and accepting.  This would
// involve adding cases for `TCP_SOCKET_STATE_UNBOUND`,
// `TCP_SOCKET_STATE_BOUND`, and `TCP_SOCKET_STATE_LISTENING` to the `match`
// statements in `Selector::select`.
//
// TODO #3: Add support for UDP sockets.  This would involve adding cases for
// the `UDP_SOCKET_STATE_*` tags to the `match` statements in
// `Selector::select`.

Signed-off-by: Joel Dice <joel.dice@fermyon.com>

Collaborator

Thomasdezeeuw left a comment

I don't really have much time to review this.

Can we split this up into multiple PRs, as this seems to do multiple things?

  1. A minimal PR that adds support for v2, without adding any v2 functionality.
  2. The selector rewrite (not sure why this is needed).
  3. Any v2 additions we make, such as support for more APIs.

let mut subscriptions = self.subscriptions.lock().unwrap();

let mut states = Vec::new();
for (fd, subscription) in subscriptions.deref() {

Collaborator

I don't think looping over all active subscriptions is a great idea. Nor is the allocation for states.

Why switch from a Vec to a HashMap?

Author

I'm happy to optimize by reducing the number of allocations if desired. Would you prefer that I update this PR or leave that for a follow-up PR?

> Why switch from a Vec to a HashMap?

I was aiming for O(1) lookups by file descriptor in the IoSourceState implementation.

> I'm happy to optimize by reducing the number of allocations if desired. Would you prefer that I update this PR or leave that for a follow-up PR?
>
> > Why switch from a Vec to a HashMap?
>
> I was aiming for O(1) lookups by file descriptor in the IoSourceState implementation.

It might be slower when taking into account hashing tho 🤔

Author

> It might be slower when taking into account hashing tho 🤔

Possibly; I can switch back to using a Vec if preferred.
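
For comparison, here is a hedged sketch of a `Vec`-backed table that still gives O(1) lookups by using the file descriptor itself as the slot index, so no hashing is involved. It assumes descriptors are small, non-negative, and fairly dense; `Subscription` and `SlotTable` are hypothetical names, and this is not how either the existing WASIp1 code or the PR is known to store subscriptions.

```rust
/// Hypothetical per-descriptor record.
#[derive(Debug, Clone, PartialEq)]
struct Subscription {
    token: usize,
}

/// Table indexed directly by file descriptor.
#[derive(Default)]
struct SlotTable {
    slots: Vec<Option<Subscription>>,
}

impl SlotTable {
    /// Stores `sub` under `fd`, growing the table if needed.
    /// Returns false for negative descriptors, which are never valid.
    fn insert(&mut self, fd: i32, sub: Subscription) -> bool {
        if fd < 0 {
            return false;
        }
        let idx = fd as usize;
        if idx >= self.slots.len() {
            self.slots.resize(idx + 1, None);
        }
        self.slots[idx] = Some(sub);
        true
    }

    /// Looks up the subscription for `fd` without hashing.
    fn get(&self, fd: i32) -> Option<&Subscription> {
        if fd < 0 {
            return None;
        }
        self.slots.get(fd as usize).and_then(|slot| slot.as_ref())
    }

    /// Clears the slot for `fd`, returning what was stored there.
    fn remove(&mut self, fd: i32) -> Option<Subscription> {
        if fd < 0 {
            return None;
        }
        self.slots.get_mut(fd as usize).and_then(|slot| slot.take())
    }
}
```

The trade-off is memory: one large descriptor value forces the table to grow to that size, whereas a `HashMap<i32, Subscription>` stays proportional to the number of live registrations.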

dicej commented Oct 14, 2024

> Can we split this up into multiple PRs, as this seems to do multiple things?
>
> 1. A minimal PR that adds support for v2, without adding any v2 functionality.
> 2. The selector rewrite (not sure why this is needed).
> 3. Any v2 additions we make, such as support for more APIs.

Items 1 and 2 are intertwined. None of the WASIp1 APIs are available in WASIp2, so a new selector implementation is needed to target the new APIs. There is an adapter which implements (most of) the WASIp1 APIs in terms of their WASIp2 counterparts, but wasi-libc doesn't use the adapter for sockets -- it uses the WASIp2 APIs directly in order to access the much broader socket support that WASIp2 provides.

As I mentioned above, we do have a few different options for a WASIp2 selector implementation (use POSIX poll(2), use the WASIp2 APIs directly, or use a mix of POSIX file descriptors and direct WASIp2 APIs for maximum compatibility), but reusing the WASIp1 implementation is not one of those options.

Totally agreed that we can leave item 3 in your list for a later PR, though.
