Support Wasm Text Format (%.wat) #1

codefromthecrypt · 2021-11-25T00:30:24Z

We currently decode only the Wasm Binary Format. This means that usage depends on an external tool that targets it (e.g. tinygo -> wasm).

Wasm also defines a Text Format. Supporting it would bring a number of benefits:

unit tests can define inlined code and trigger specific edge cases
new users can familiarize themselves before learning a toolchain. Ex. following along in a book
wasm used for benchmarks will have more predictable speed and allocations as the translation is direct (not embedding GC impl etc)
power users who want better performance or less bloat can write the text format directly instead of another language first.
modules supplied in the text format could be compiled in a single pass into the wazeroir intermediate format

Non-goals:

supporting the unreleased 1.1 specification. This will only be 1.0
wast format, this will only be wat format
exporting anything except a Parse function and mandatory options

Optional and should be in follow-up pull requests, not a Big Bang:

writing %.wasm from %.wat (or at all) is optional
While a go-native wat2wasm is a nice to have, it can be decoupled

There will be some challenges and choices along the way.

The text format supports a mix of expression styles: S expressions vs stack
The text format supports both indexed parameters and named ones
Validation may overlap with what's already done when decoding the binary
There are pros and cons about single pass vs parsing into Module first
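
To make the indexed-versus-named choice concrete, here is a minimal sketch of how a parser might resolve a local reference that is either a symbolic ID (`$x`) or a numeric index. The helper `resolveLocal` and its parameters are hypothetical, not part of wazero, and the sketch only accepts decimal indices, whereas the text format also allows other number forms.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// resolveLocal is a hypothetical helper: it maps a text-format local
// reference, either a symbolic ID such as "$x" or a decimal index such as
// "1", to its numeric index. names holds the IDs declared by the function.
func resolveLocal(names map[string]uint32, localCount uint32, ref string) (uint32, error) {
	if strings.HasPrefix(ref, "$") {
		idx, ok := names[ref]
		if !ok {
			return 0, fmt.Errorf("unknown ID %s", ref)
		}
		return idx, nil
	}
	n, err := strconv.ParseUint(ref, 10, 32)
	if err != nil {
		return 0, fmt.Errorf("malformed index %q: %w", ref, err)
	}
	if uint32(n) >= localCount {
		return 0, fmt.Errorf("index %d out of range, have %d locals", n, localCount)
	}
	return uint32(n), nil
}

func main() {
	// Assumed input: (func (param $x i32) (param $y i32) ...)
	names := map[string]uint32{"$x": 0, "$y": 1}
	for _, ref := range []string{"$y", "0", "$z", "2"} {
		idx, err := resolveLocal(names, 2, ref)
		fmt.Println(ref, idx, err)
	}
}
```

Resolving names to indices at parse time like this means later stages only ever see indices, which is also all the binary format carries for locals outside the optional name section.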

Development can and should be done incrementally. For example, by keeping all code internal, we can convert the wasm already used in this project, from simplest to most complicated, into inlined or testdata wat for test cases or benchmarks.

Those working around this in the meantime can install wat2wasm on their system and call it from Go, similar to this, then use wasm.DecodeModule() on the result:

import (
	"os"
	"os/exec"
	"path/filepath"
	"testing"

	"github.com/stretchr/testify/require"
)

// requireWasm temporarily calls system `wat2wasm` until we implement it here
func requireWasm(t *testing.T, wat string) []byte {
	t.Helper()
	dir := t.TempDir() // removed automatically when the test ends
	watFile := filepath.Join(dir, "temp.wat")
	require.NoError(t, os.WriteFile(watFile, []byte(wat), 0o600))

	wasmFile := filepath.Join(dir, "temp.wasm")
	require.NoError(t, exec.Command("wat2wasm", watFile, "-o", wasmFile).Run())
	bin, err := os.ReadFile(wasmFile)
	require.NoError(t, err)
	return bin
}

Implementing this well means lexing and parsing well, including retaining line and column information for errors. For example, goawk and mugo may be helpful background reading.
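
As a rough illustration of retaining position information, the sketch below attaches a line and column to every token so that a later parse error can point at its source. The `token` type and `lex` function are hypothetical and far from a full text-format lexer: there are no strings, comments, or number handling, and columns count bytes rather than characters.

```go
package main

import "fmt"

// token is a hypothetical token type that carries its source position.
type token struct {
	kind      string // "lparen", "rparen", or "word"
	text      string
	line, col int // both 1-based; col counts bytes
}

// isDelim reports whether c ends a word.
func isDelim(c byte) bool {
	return c == ' ' || c == '\t' || c == '\r' || c == '\n' || c == '(' || c == ')'
}

// lex splits src into parentheses and whitespace-delimited words, recording
// the line and column where each token starts.
func lex(src string) []token {
	var toks []token
	line, col := 1, 1
	for i := 0; i < len(src); {
		switch c := src[i]; c {
		case '\n':
			line, col = line+1, 1
			i++
		case ' ', '\t', '\r':
			col++
			i++
		case '(':
			toks = append(toks, token{"lparen", "(", line, col})
			col++
			i++
		case ')':
			toks = append(toks, token{"rparen", ")", line, col})
			col++
			i++
		default:
			start := i
			for i < len(src) && !isDelim(src[i]) {
				i++
			}
			toks = append(toks, token{"word", src[start:i], line, col})
			col += i - start
		}
	}
	return toks
}

func main() {
	for _, t := range lex("(module\n  (func $f))") {
		fmt.Printf("%d:%d %s %q\n", t.line, t.col, t.kind, t.text)
	}
}
```

Because the position travels with the token, a parser built on top can format errors as `line:col: message` without re-scanning the input.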

benhoyt · 2021-11-25T00:41:06Z

Thanks for the shout-out to GoAWK. :-) Regarding line and column info: most GoAWK errors are errors during parsing, where I have line and column info in the tokens being parsed, so I can include that in the error. However, I don't store this in the syntax tree (AST nodes), so the few runtime errors that do exist don't include col/line info. If doing it again I'd probably add that info to AST nodes, though.

codefromthecrypt · 2021-12-16T04:40:18Z

Went with a loop instead of an iterator API, and only implemented the basic lexer so far. The column and line positions will indeed make it upwards. Next step is a proof-of-concept parser, then finishing the lexing, of which floating point will be the most tedious! tetratelabs/wazero#63

codefromthecrypt · 2021-12-18T09:45:41Z

In studying next steps, one thing I recognize is that at some point we need to do full wat2wasm to satisfy FunctionInstance.Body, which is a field that contains the binary encoding of the function.

Meanwhile, I'm focused on how to surface the stream of parsing. It appears that peeking at 2 known tokens is routinely needed, e.g. lparen and a field name (keyword). It may be efficient to provide a windowing function of up to N tokens to allow the parser function to do more. At the moment, I think seeing only one token isn't useful at all, and you can see that wabt, for example, routinely needs 2: https://github.com/WebAssembly/wabt/blob/main/src/wast-parser.h#L79-L96
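
A windowing function along those lines could look roughly like the following sketch. `tokenWindow`, `peek`, `advance`, and `fromSlice` are hypothetical names, tokens are plain strings for brevity, and the buffer grows as needed rather than being capped at N.

```go
package main

import "fmt"

// tokenWindow is a hypothetical lookahead buffer between a lexer and a
// parser. It grows on demand; a real implementation might cap it at N.
type tokenWindow struct {
	next func() (string, bool) // pulls one token from the lexer; false at end of input
	buf  []string              // tokens read from the lexer but not yet consumed
}

// peek returns the token n positions ahead (0 is the next one) without
// consuming it.
func (w *tokenWindow) peek(n int) (string, bool) {
	for len(w.buf) <= n {
		tok, ok := w.next()
		if !ok {
			return "", false
		}
		w.buf = append(w.buf, tok)
	}
	return w.buf[n], true
}

// advance consumes and returns the next token.
func (w *tokenWindow) advance() (string, bool) {
	tok, ok := w.peek(0)
	if ok {
		w.buf = w.buf[1:]
	}
	return tok, ok
}

// fromSlice adapts a fixed token slice to the next function, standing in
// for a real lexer.
func fromSlice(toks []string) func() (string, bool) {
	i := 0
	return func() (string, bool) {
		if i >= len(toks) {
			return "", false
		}
		i++
		return toks[i-1], true
	}
}

func main() {
	w := &tokenWindow{next: fromSlice([]string{"(", "func", "$f", ")"})}
	// Choose the grammar rule from "(" plus the keyword that follows it.
	first, _ := w.peek(0)
	second, _ := w.peek(1)
	if first == "(" && second == "func" {
		fmt.Println("parse a func field")
	}
}
```

Peeking two tokens lets the parser pick the rule for a field before consuming anything, so no backtracking is needed for the common `(keyword ...` shape.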

codefromthecrypt · 2022-05-28T01:44:06Z

TL;DR: I think we should delete the text compiler to focus energy on the ever expanding responsibilities catering to other core specs. 👍 or if you 👎 please add a comment why and how we can solve the labor issue.

WebAssembly 2.0 is dramatically larger and features beyond that larger still. We have limited capacity as specs expand.
Few "real" integrations actually use the text format. For example, most distribute the binary format
The text format isn't the larger concern of compilers, as many more ask about other languages like Golang (TinyGo)
We already need to use wabt routinely to address things like spec tests
The existing text compiler needs rework which costs attention better spent towards more requested things like DWARF
The existing text compiler could be spun into a dependency free repo and used near seamlessly should people become available to do it, and a separate repo removes weight.

Even though I personally spent months on this, I think the best choice is to delete this code, both to give better attention to the core responsibilities of a runtime and to do so before we release 1.0.

codefromthecrypt · 2022-05-28T01:56:56Z

PS I'm totally game, at least internally, to divert attention to a WasmBuilder which, starting as internal code, would be able to materialize a module in the binary format. This is significantly easier to do than parsing and could produce the simple modules we tend to use in unit tests. In other words, removing the text compiler doesn't imply a huge amount of checked-in binaries in this repo. We could spend the energy in a different way to support ad-hoc modules, and providing an alternate utility could be in the same change that removes the text compiler.
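
To give a feel for why emitting binary is simpler than parsing text, here is a minimal sketch that builds a valid WebAssembly 1.0 module containing a single do-nothing function. It is not the WasmBuilder itself: `appendSection` and `buildNopModule` are hypothetical names, and section sizes are limited to one byte to avoid a full LEB128 encoder.

```go
package main

import "fmt"

// Section IDs from the WebAssembly 1.0 binary format.
const (
	sectionType     = 1
	sectionFunction = 3
	sectionCode     = 10
)

// appendSection appends a section ID, the content size, then the content.
// The size is written as a single byte, which is a valid LEB128 encoding
// only for contents under 128 bytes: enough for this sketch.
func appendSection(b []byte, id byte, content []byte) []byte {
	if len(content) > 127 {
		panic("sketch supports only sections under 128 bytes")
	}
	b = append(b, id, byte(len(content)))
	return append(b, content...)
}

// buildNopModule returns a module with one function of type () -> ()
// whose body contains no instructions.
func buildNopModule() []byte {
	b := []byte{0x00, 'a', 's', 'm', 0x01, 0x00, 0x00, 0x00} // magic and version
	// One type: func (0x60) with 0 params and 0 results.
	b = appendSection(b, sectionType, []byte{0x01, 0x60, 0x00, 0x00})
	// One function, using type index 0.
	b = appendSection(b, sectionFunction, []byte{0x01, 0x00})
	// One body of size 2: 0 local declarations, then the end opcode (0x0b).
	b = appendSection(b, sectionCode, []byte{0x01, 0x02, 0x00, 0x0b})
	return b
}

func main() {
	fmt.Printf("% x\n", buildNopModule())
}
```

The resulting bytes could be passed to a binary decoder in a unit test in place of a checked-in %.wasm file.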

codefromthecrypt · 2022-05-28T02:12:29Z

If folks are wondering why I piped up now, it is more than just the things on wazero 1.0 or webassembly 2.0. What I noticed was that the next WASI is built on the component model which extends grammar even further.

https://github.com/WebAssembly/component-model/blob/bcc2002c8b74381004c363f7b04853c1e636d9ca/design/mvp/Explainer.md

I think the best focus of this project is the runtime, and particularly laboring towards the best JIT we can do, and the best dev/debug story we can do. The text format has very little to do with this, yet has an ever-expanding definition. Choosing our battles wisely means ditching it.

EOF

codefromthecrypt · 2022-05-28T04:44:43Z

What I'll do is create a new repo called watzero and migrate code in such a way that it is standalone. I'll temporarily add a dependency back here to watzero then replace those parts with a wasm builder api before we cut 1.0

That's the most decoupled plan I can think of which also gives hope if someone wants to help move forward a dependency free wat2wasm go lib. If watzero doesn't end up doing that, we'll archive it.

codefromthecrypt · 2022-05-29T01:05:53Z

I started to do the migration in a separate repo, but that didn't work out well. I started over doing it in internal/watzero, which we can then git subtree out to its own repo. Doing it this way helps as it allows less thrash while sorting out the code dependencies.

This drops the text format (%.wat) and renames InstantiateModuleFromCode to InstantiateModuleFromBinary as it is no longer ambiguous.

We decided to stop supporting the text format as it isn't typically used in production, yet costs a lot of work to develop. Given the resources available and the increased work added with WebAssembly 2.0 and soon WASI 2, we can't afford to spend the time on it. The old parser is used only internally and will eventually be moved to its own repository named watzero, possibly towards archival.

See #59

Signed-off-by: Adrian Cole <adrian@tetrate.io>

codefromthecrypt · 2022-09-19T06:11:47Z

Planning to cancel the text parser altogether in #2. Doing so can help us keep the rest of the code alive, which has more utility and is easier to maintain. If we start having enough help, the prior commit has the last working version.

codefromthecrypt · 2022-09-20T00:06:57Z

Closing as won't fix so that we can focus on the other parts. It seems a lot more people are interested in compiling Go into wasm than in wat2wasm.

codefromthecrypt mentioned this issue Nov 25, 2021

Addresses some IDE warnings relating to GoDoc comments benhoyt/goawk#76

Merged

codefromthecrypt mentioned this issue Jan 6, 2022

some example show calling wasm written with go office tetratelabs/wazero#74

Closed

codefromthecrypt mentioned this issue Apr 25, 2022

wazero 1.0 tetratelabs/wazero#506

Closed

13 tasks

codefromthecrypt mentioned this issue Jun 1, 2022

Drops support for the WebAssembly text format tetratelabs/wazero#614

Merged

codefromthecrypt mentioned this issue Aug 16, 2022

Move wasm, wasi_snapshot_preview1, ieee754 and similar folders out of internal into utils or helpers tetratelabs/wazero#745

Closed

codefromthecrypt mentioned this issue Aug 30, 2022

Removes Text Format tetratelabs/wazero#778

Merged

codefromthecrypt transferred this issue from tetratelabs/wazero Aug 30, 2022

codefromthecrypt closed this as completed Sep 20, 2022
