Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wasm2c: implement the bulk memory operations proposal #1877

Merged
merged 9 commits into from
Sep 21, 2022

Conversation

keithw
Copy link
Member

@keithw keithw commented Mar 31, 2022

(This PR is sequenced behind #1875 and #1814.)

Support bulk memory operations in wasm2c (the next step in the checklist at #1853 (comment)).

This PR also adds test coverage:

  • The current testsuite's elem.wast test (we had previously been using a pre-bulk-memory version in old-spec. These "old" tests are tracked at wasm2c no longer passes many spec tests #1737.)
  • The current testsuite's bulk.wast, memory_copy.wast, memory_fill.wast, and memory_init.wast (these were added in the bulk-memory proposal)
  • The current multi-memory proposal's load.wast and store.wast tests (we had previously been omitting these because they require bulk-memory support, discussed at wasm2c: run multi-memory tests #1834. Another multi-memory tests requires reference-types support, and the final multi-memory test requires SIMD.)
  • Bulk-memory proposal versions of the imports.wast, linking.wast, table_copy.wast, and table_init.wast tests. (The current versions of these are waiting on reference-types support.)

@keithw keithw requested a review from sbc100 March 31, 2022 21:40
@keithw keithw changed the title wasm2c: implement the bulk-memory operations proposal wasm2c: implement the bulk memory operations proposal Mar 31, 2022
@keithw keithw force-pushed the bulk-memory branch 2 times, most recently from dae6e37 to 7c8b9a5 Compare April 4, 2022 10:43
@keithw keithw requested a review from kripken April 4, 2022 17:17
Copy link
Member

@binji binji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm! Since I'm not maintaining this anymore you may want to wait for further review, but I wanted to give you some feedback

src/c-writer.cc Outdated Show resolved Hide resolved
src/c-writer.cc Outdated Show resolved Hide resolved
src/c-writer.cc Outdated Show resolved Hide resolved
src/c-writer.cc Outdated Show resolved Hide resolved
src/c-writer.cc Outdated Show resolved Hide resolved
test/run-spec-wasm2c.py Outdated Show resolved Hide resolved
test/run-spec-wasm2c.py Outdated Show resolved Hide resolved
test/run-spec-wasm2c.py Outdated Show resolved Hide resolved
wasm2c/README.md Outdated Show resolved Hide resolved
wasm2c/examples/fac/fac.wat Outdated Show resolved Hide resolved
@keithw
Copy link
Member Author

keithw commented Apr 5, 2022

@binji Thank you muchly for the reviews! I wish GitHub had a way for us to mark the stacked PRs more clearly -- if possible, could we take these in order (so #1814 [module instancing] would be next, before we get to this one [bulk memory], reference types [#1887], and tolerating partly invalid modules [#1888])? Some of your comments here are things we'll need to fix in #1814 where the relevant commits originate, and then our plan was to rebase the later PRs on top of whatever ultimately lands.

keithw added a commit to fix-project/wabt that referenced this pull request Apr 6, 2022
@keithw keithw force-pushed the bulk-memory branch 2 times, most recently from fdec286 to f5d4deb Compare April 6, 2022 11:34
@binji
Copy link
Member

binji commented Apr 15, 2022

Sorry about reviewing out of order, I should have looked more closely at the description!

@keithw keithw force-pushed the bulk-memory branch 3 times, most recently from 85e60d0 to f36fb7c Compare April 28, 2022 05:45
@sbc100
Copy link
Member

sbc100 commented Apr 28, 2022

Does this change really depend on instancing ( #1814)? I wonder if it can land first?

@keithw
Copy link
Member Author

keithw commented Apr 28, 2022

Does this change really depend on instancing ( #1814)? I wonder if it can land first?

It's possible, but it would be a bit of work to backport this to a pre-instanced implementation. I guess I thought we had coalesced on this ordering when we talked at #1853 (comment).

The major "instance-related" design decisions/consequences here are:

  • as before, we store each segment as a static const array at file scope (belonging to the "module" itself), but now each module instance has a boolean (in a bitfield) to track whether the corresponding segment has been dropped. (A drop instruction doesn't actually free memory here; even if all extent instances have dropped the segment, the caller might later create another instance of the same module...)

  • before bulk memory, it was possible to use element segments only at "wasm2c-time", to write the code that initializes the table as part of init_table, and then not actually store the element segment itself. But after bulk memory, the element segment has to be stored so that it can be used at runtime in a table.init. To do that, we have to be a bit careful in choosing the representation of the init expression, because at runtime the members of a funcref-type table are function instances (which close over the originating module), but when writing down the element segment itself (before any module has been instantiated with its imports), you don't know what the originating module will be. So, we represent the elem segment init expr as a local type index, a funcref, and an offset within the module instance structure to the originating module pointer. At runtime, the implementation of table_init converts this to a global type index, the funcref, and the actual pointer to the originating module (looked up from the module instance structure).

@sbc100
Copy link
Member

sbc100 commented Apr 28, 2022

Does this change really depend on instancing ( #1814)? I wonder if it can land first?

It's possible, but it would be a bit of work to backport this to a pre-instanced implementation. I guess I thought we had coalesced on this ordering when we talked at #1853 (comment).

The major "instance-related" design decisions/consequences here are:

  • as before, we store each segment as a static const array at file scope (belonging to the "module" itself), but now each module instance has a boolean (in a bitfield) to track whether the corresponding segment has been dropped. (A drop instruction doesn't actually free memory here; even if all extent instances have dropped the segment, the caller might later create another instance of the same module...)
  • before bulk memory, it was possible to use element segments only at "wasm2c-time", to write the code that initializes the table as part of init_table, and then not actually store the element segment itself. But after bulk memory, the element segment has to be stored so that it can be used at runtime in a table.init. To do that, we have to be a bit careful in choosing the representation of the init expression, because at runtime the members of a funcref-type table are function instances (which close over the originating module), but when writing down the element segment itself (before any module has been instantiated with its imports), you don't know what the originating module will be. So, we represent the elem segment init expr as a local type index, a funcref, and an offset within the module instance structure to the originating module pointer. At runtime, the implementation of table_init converts this to a global type index, the funcref, and the actual pointer to the originating module (looked up from the module instance structure).

No problem. If its not a simple rebase we can keep the sequence as is

archlinux-github pushed a commit to archlinux/svntogit-community that referenced this pull request Jun 10, 2022
The upgrade broke the Firefox build. wasm2c can't handle bulk memory ops yet.

https://bugzilla.mozilla.org/show_bug.cgi?id=1773200
WebAssembly/wabt#1877



git-svn-id: file:///srv/repos/svn-community/svn@1230052 9fca08f4-af9d-4005-b8df-a31f2cc04f65
archlinux-github pushed a commit to archlinux/svntogit-community that referenced this pull request Jun 10, 2022
The upgrade broke the Firefox build. wasm2c can't handle bulk memory ops yet.

https://bugzilla.mozilla.org/show_bug.cgi?id=1773200
WebAssembly/wabt#1877

git-svn-id: file:///srv/repos/svn-community/svn@1230052 9fca08f4-af9d-4005-b8df-a31f2cc04f65
@keithw keithw force-pushed the bulk-memory branch 4 times, most recently from e160874 to 13642fc Compare August 16, 2022 01:15
@keithw keithw force-pushed the bulk-memory branch 2 times, most recently from de1e4dd to 1793a5d Compare September 16, 2022 16:58
@keithw
Copy link
Member Author

keithw commented Sep 20, 2022

@sbc100 Do you want to take a look at this one? (It will be helpful for updating the testsuite as more of the spec tests have started to require bulk-memory.)

Copy link
Member

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

lgtm with comments

@@ -1,4 +1,5 @@
# https://stackoverflow.com/a/47801116
file(READ ${in} content)
set(content "const char* ${symbol} = R\"w2c_template(${content})w2c_template\";")
string(REGEX REPLACE "(.[^\n]*\n)" "R\"w2c_template(\\1)w2c_template\"\n" content "${content}")
set(content "const char* ${symbol} = ${content};")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MSVC has a limit on the length of string literals that was getting exceeded (I think because wasm2c.declarations.c is now longer than 16KiB). This switches it to being one string literal per line of source.

@@ -444,4 +444,85 @@ static float wasm_sqrtf(float x) {
return sqrtf(x);
}

static inline void memory_fill(wasm_rt_memory_t* mem, u32 d, u32 val, u32 n) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to somehow only emit this code when bulk memory is enabled? Perhaps we could define _WASM2C_HAS_BULK_MEMORY and make this conditional? I guess it doesn't make much difference? Maybe tiny modules would compile marginally faster?

Maybe not worth it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can do this if you want, but given that bulk memory is enabled by default (and has been for almost a year), it seems like a niche case... and it's not like these are particularly complicated functions to compile. (Some of the math functions are arguably more complicated and we emit those unconditionally...)

wasm2c/examples/fac/fac.c Show resolved Hide resolved
}

static void init_elem_instances(Z_fac_instance_t *instance) {
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It it worth eliding all these function when they are empty (to make the output more readable?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, done in #1999 which hopefully we can do as a follow-on to this one.

Write("bool ", "data_segment_dropped_", GetGlobalName(data_segment->name),
" : 1;", Newline());
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should have some more examples of wasm2c output actual checked in so that when reviewing these changes we can see the effect they have on the output? Not sure what the best way to do that is.

Copy link
Member Author

@keithw keithw Sep 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also like this idea but not sure how far to take it... some options are probably:

  • check in rot13.c like we do fac.c?
  • check in a few representative outputs of running wasm2c on the spec testsuite?
  • check in every single output of running wasm2c on the spec testsuite?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about a few hand-crafted tests under test/wasm2c that include the input wat and the output c code in the same same file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a first stab at this in #2006.

src/c-writer.cc Outdated Show resolved Hide resolved
@keithw keithw enabled auto-merge (squash) September 21, 2022 05:41
@keithw keithw merged commit 608d23e into WebAssembly:main Sep 21, 2022
@keithw keithw deleted the bulk-memory branch September 21, 2022 06:00
talg pushed a commit to PLSysSec/wabt that referenced this pull request Sep 26, 2022
matthias-blume pushed a commit to matthias-blume/wabt that referenced this pull request Dec 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants