
C++20 co_await support for Embind promises #20420

Merged (6 commits), Nov 3, 2023

Conversation

RReverser (Collaborator)

This adds support for `co_await`-ing Promises represented by `emscripten::val`.

The surrounding coroutine should also return `emscripten::val`, which will be a promise representing the whole coroutine's return value.

Note that this feature uses LLVM coroutines and so doesn't depend on either Asyncify or JSPI. It doesn't pause the entire program, only the coroutine itself, so it serves somewhat different use cases even though all those features operate on promises.

Conversely, if you are not implementing a syscall that must behave as if it were synchronous, but simply want to await some async operations and return a new promise to the user, this feature is much more efficient.

Here's a simple benchmark measuring runtime overhead from awaiting on a no-op Promise repeatedly in a deep call stack:

```cpp
using namespace emscripten;

// clang-format off
EM_JS(EM_VAL, wait_impl, (), {
  return Emval.toHandle(Promise.resolve());
});
// clang-format on

val wait() { return val::take_ownership(wait_impl()); }

val coro_co_await(int depth) {
  co_await wait();
  if (depth > 0) {
    co_await coro_co_await(depth - 1);
  }
  co_return val();
}

val asyncify_val_await(int depth) {
  wait().await();
  if (depth > 0) {
    asyncify_val_await(depth - 1);
  }
  return val();
}

EMSCRIPTEN_BINDINGS(bench) {
  function("coro_co_await", coro_co_await);
  function("asyncify_val_await", asyncify_val_await, async());
}
```

And the JS runner, which also compares against a pure-JS implementation:

```js
import Benchmark from 'benchmark';
import initModule from './async-bench.mjs';

let Module = await initModule();
let suite = new Benchmark.Suite();

function addAsyncBench(name, func) {
  suite.add(name, {
    defer: true,
    fn: (deferred) => func(1000).then(() => deferred.resolve()),
  });
}

for (const name of ['coro_co_await', 'asyncify_val_await']) {
  addAsyncBench(name, Module[name]);
}

addAsyncBench('pure_js', async function pure_js(depth) {
  await Promise.resolve();
  if (depth > 0) {
    await pure_js(depth - 1);
  }
});

suite
  .on('cycle', function (event) {
    console.log(String(event.target));
  })
  .run({async: true});
```

Results with regular Asyncify (I had to bump up `ASYNCIFY_STACK_SIZE` to accommodate said deep stack):

```bash
> ./emcc async-bench.cpp -std=c++20 -O3 -o async-bench.mjs --bind -s ASYNCIFY -s ASYNCIFY_STACK_SIZE=1000000
> node --no-liftoff --no-wasm-tier-up --no-wasm-lazy-compilation --no-sparkplug async-bench-runner.mjs

coro_co_await x 727 ops/sec ±10.59% (47 runs sampled)
asyncify_val_await x 58.05 ops/sec ±6.91% (53 runs sampled)
pure_js x 3,022 ops/sec ±8.06% (52 runs sampled)
```

Results with JSPI (I had to disable `DYNAMIC_EXECUTION` because I was getting "RuntimeError: table index is out of bounds" in random places depending on optimisation mode - JSPI miscompilation?):

```bash
> ./emcc async-bench.cpp -std=c++20 -O3 -o async-bench.mjs --bind -s ASYNCIFY=2 -s DYNAMIC_EXECUTION=0
> node --no-liftoff --no-wasm-tier-up --no-wasm-lazy-compilation --no-sparkplug --experimental-wasm-stack-switching async-bench-runner.mjs

coro_co_await x 955 ops/sec ±9.25% (62 runs sampled)
asyncify_val_await x 924 ops/sec ±8.27% (62 runs sampled)
pure_js x 3,258 ops/sec ±8.98% (53 runs sampled)
```

So this approach is much faster than regular Asyncify, and on par with JSPI.

Fixes #20413.

(Several intermediate comments were marked as outdated or off-topic.)

@sbc100 (Collaborator)

sbc100 commented Oct 9, 2023

@tlively who is the maintainer of the current C/C++ promise integration.

@RReverser (Collaborator, Author)

> who is the maintainer of the current C/C++ promise integration.

Yeah, I added him as a reviewer, but I'll reiterate, as I did in the issue, that this feature is separate from Asyncify / JSPI / promise.h and doesn't rely on them in any way (by design).

Unlike in Asyncify / JSPI, in this case only the local coroutine is paused, and that's done entirely by C++ / LLVM compiling those co_awaits into a state machine, so the Wasm engine doesn't know anything about promises. We're only providing Embind bindings for that operator.

@RReverser RReverser changed the title co_await support for Embind C++20 co_await support for Embind promises Oct 11, 2023
@RReverser (Collaborator, Author)

I'll add that one particular motivation for this is that it's impossible to use the proxying APIs with Asyncify at the moment; even if it were possible, it would be difficult because different threads could compete for Asyncify state, which can't handle queueing.

So when I needed to do something like

```cpp
proxySync([] {
  val foo = doSomethingSynchronous();
  val bar = foo.call<val>("someAsyncMethod").await();
  val baz = somethingSynchronousAgain(bar);
  ...another await..
});
```

it had to be rewritten so that each synchronous call was proxied separately in one proxySync-based helper, and each asynchronous call had to be wrapped into another helper that subscribes to the promise and uses proxySyncWithCtx, so execution kept going back and forth between threads in a rather ugly mix of code.

With coroutines this is not a problem, as I can proxy the entire coroutine execution to the main thread in one go and use co_await for awaiting those intermediate values.

@RReverser (Collaborator, Author)

Ping, would love a review on this. I have a project where having co_await would come in very handy.

@brendandahl (Collaborator) left a comment


Having not used C++20 coroutines at all yet, this seems reasonable and a pretty nice feature for avoiding Asyncify/JSPI. I'd still like to hear @tlively's thoughts, since he had started doing something with coroutines.

Files with resolved review comments:
- site/source/docs/api_reference/val.h.rst
- system/include/emscripten/val.h
- test/embind/test_val_coro.cpp
@RReverser (Collaborator, Author)

RReverser commented Oct 20, 2023

> and a pretty nice feature to avoid asyncify/jspi

@brendandahl Btw, just wanted to add that I found it has its place in combination with Asyncify too, not necessarily instead of.

In particular, it helps to write a helper coroutine that does a bunch of co_awaits (handled by the C++ compiler), and then do a single my_func().await() to wait for all of them to finish, this time via Asyncify.

This way, when you need to pause the entire app, you can do so with just one expensive Asyncify unwind/rewind, instead of lots of unwinds/rewinds for every individual async operation inside such an async-heavy function.
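That pattern could be sketched roughly as follows (hedged: `fetch_json` and `combine` are hypothetical helpers used only for illustration, not existing APIs; this assumes Embind's coroutine support from this PR plus Asyncify enabled):

```cpp
// All co_awaits below are lowered by the C++ compiler into a state machine;
// Asyncify is not involved until the final .await().
val load_all() {
  val a = co_await fetch_json("a.json");  // hypothetical async JS import
  val b = co_await fetch_json("b.json");
  co_return combine(a, b);                // hypothetical synchronous helper
}

void synchronous_caller() {
  // A single Asyncify unwind/rewind for the whole batch,
  // instead of one per individual async operation.
  val result = load_all().await();
}
```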

@sbc100 (Collaborator)

sbc100 commented Oct 22, 2023

I'm afraid I haven't yet understood C++ coroutines or how they can be useful here. I would like to take some time to better understand this change, but if @tlively or @brendandahl think they have a good handle on it, I'll defer to them.

@tlively (Member)

tlively commented Oct 22, 2023

Meanwhile I don't understand any of the embind stuff, nor do I understand the C++20 co_await stuff off the top of my head 😅

I'll have to dig into cppreference to understand the coroutine stuff here. @brendandahl and @sbc100, do you want to schedule some time to review this collaboratively?

@RReverser (Collaborator, Author)

Lol I love where this is going 😅

To be fair, I didn't know anything about C++ coroutines / co_await before trying to implement this either, and the terminology C++ uses for this stuff is confusing / different from other languages, which didn't help... but in the end it works.

> @brendandahl and @sbc100, do you want to schedule some time to review this collaboratively?

FWIW I'd be happy to join a meeting if that helps, assuming a Europe-friendly time.

@sbc100 (Collaborator)

sbc100 commented Oct 23, 2023

> Lol I love where this is going 😅
>
> To be fair, I didn't know anything about C++ coroutines / co_await before trying to implement it either, and the terminology C++ used for this stuff is confusing / different from other languages, which didn't help... but in the end it works.
>
> > @brendandahl and @sbc100, do you want to schedule some time to review this collaboratively?
>
> FWIW I'd be happy to join in for a meeting if that helps, assuming Europe-friendly time.

Sounds good. How about Friday morning. 9am PST?

@RReverser (Collaborator, Author)

RReverser commented Oct 23, 2023

> How about Friday morning. 9am PST?

Hm, so it will be Friday evening here... tentatively yes, but if we can do it between Tuesday and Thursday, that would be a bit better.

@RReverser (Collaborator, Author)

> Sounds good. How about Friday morning. 9am PST?

Should we give it a try this week to help unblock this?

@sbc100 (Collaborator)

sbc100 commented Nov 2, 2023

> > Sounds good. How about Friday morning. 9am PST?
>
> Should we give it a try this week to help unblock this?

9am PST tomorrow? SGTM. I'll send out a calendar invite.

Files with resolved review comments:
- system/include/emscripten/val.h
- test/embind/test_val_coro.cpp
- src/library_sigs.js
- test/test_core.py
@RReverser force-pushed the emval-coro branch 2 times, most recently from 44b2c0d to fba6de7, November 3, 2023 18:59
@RReverser (Collaborator, Author)

Decided to add a bunch of comments explaining what those special coroutine types and methods do. They somewhat duplicate the generic docs about C++ coroutines, but I figured they might be useful in the code, as probably few people ever have to implement them.

@RReverser RReverser enabled auto-merge (squash) November 3, 2023 19:48
@RReverser RReverser merged commit 8ecbdb3 into emscripten-core:main Nov 3, 2023
2 checks passed
@RReverser RReverser deleted the emval-coro branch November 3, 2023 20:46
@RReverser (Collaborator, Author)

Can someone push the new docs to the website, please? @kripken IIRC you had to do that manually in the past - is that still the case?

@kripken (Member)

kripken commented Nov 7, 2023

Yes. Updated now!

@andreamancuso

andreamancuso commented Jul 20, 2024

Hi, this looks great. I'm getting my hands dirty with coroutines and was wondering if you were planning to add coroutine support to -sFETCH?

I have been trying to parallelize HTTP requests and do when_all(); I enabled pthreads but am still getting an error: [image]

Linked issue: C++20 coroutines + Native webassembly promise integration

7 participants