-
Notifications
You must be signed in to change notification settings - Fork 256
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
threads: fuzz shared-everything-threads
#1756
Conversation
At this stage (after df11990) the fuzzer will trigger an $ cargo +nightly fuzz run run
...
thread '<unnamed>' panicked at crates/wasm-mutate/src/module.rs:46:18:
not implemented
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace It is unclear to me why this hasn't been triggered previously: shouldn't occasionally enabling GC have triggered this failure due to wasm-tools/crates/wasm-smith/src/core.rs Lines 1052 to 1058 in 8c3346e
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great modulo some very minor nitpicks -- thanks!
This adds initial support for fuzzing the WebAssembly side of the shared-everything-threads [proposal]. Eventual addition of the CM builtins will require extensions to this. [proposal]: https://github.com/WebAssembly/shared-everything-threads
@fitzgen, as I revisited this PR, I realized that I wanted to add several missing bits as well as split several of the commits into separate PRs. Since I've substantially changed the PR, let's forget about the previous review, though I do believe I've addressed the comments you made. This PR is currently in an intermediate state and I could use some help figuring out what to do: I could (a) just continue hacking away at this until fuzzing "fully works" but that might take a while due to the large amount of reverse validation I have to add to wasm-smith (i.e., many instructions generate invalid parameters in a shared function context). Another option is to (b) just cordon off the ability to create shared function bodies for now; this would allow generating all the types and what not and for that to be fuzzed while I work on (a). Opinions? That last commit, a17f4f4, is a half-hearted attempt to avoid fuzz errors when fuzzing locally and will go away. |
I'm all for landing incremental progress, so long as the parts that are landing are working. We don't need to take an all-or-nothing approach to adding support for generating new wasm proposals in |
Since it may take a while to implement all the "reverse validation" in wasm-smith's instruction generator, this effectively cordons off that bit of the fuzzer so that we can at least generate shared types in modules. Later, once the instruction generator is smarter, we will remove this filter and start generating shared function bodies.
Like shared function bodies, this can be cordoned off until later.
In 2a2ddd5, I tried to avoid `ref.func` expressions entirely for shared types but unfortunately the other side could still occur: a `ref.func` on an unshared table that generates a shared funcref. This filters out the available types to match sharedness.
Ok, I've removed the last commit and added others that allow this to reach at least 200K fuzz iterations locally. This PR still does not generate shared function bodies but some of the plumbing is now in place at least. The latest fuzz failure happens in the One thing I've started to worry about in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The latest fuzz failure happens in the wasm-mutate: ... it may be time to let oss-fuzz start finding them?
Failures are fine yeah, and 200k+ is a good start. I'd recommend maybe ~1h of local fuzz time in the background and if that turns up nothing then it's probably ready for oss-fuzz.
For this failure specifically it might be best to just reject modules with shared types for now in wasm-mutate
. There's a number of mutators in wasm-mutate and many of them are written in such a way to be pretty generic over the input module. My guess is that ConstExpressionMutator
is probably what's doing this and either updating that to just not mutate these expressions with shared
or adding support there is reasonable.
One thing I've started to worry about in wasm-smith is generation speed. I'm not sure what effect this PR has on the number of fuzz executions per second, but I don't like all the added ...map().filter()... logic to get at the shared or unshared sides of the world.
To confirm, I only saw one of those in this PR, is that right?
You're right that generation speed matters, and it's one we've had to optimize for in the past. What you mention of having lists of shared/unshared things I think makes sense. What I'd recommend doing is that whenever an operation needs a linear search to build up a cache for that beforehand and consult the cache instead. The CodeBuilder
class has an *Allocations
type (IIRC) for this where it builds up a bunch of caches for function body generation. You can do that sooner as well though if you need the caches for generation of the module skeleton itself.
Overall the basic rule of thumb is to avoid quadratic behavior. That'll time out on oss-fuzz and happens whenever a linear search is done for an operation such as per-opcode selection. That means that the one loop I found in this PR as-is, a linear search at the top-level of generating functions, is no issue. That's just a one-time processing of the input so there's no problem with that.
crates/wasm-smith/src/core.rs
Outdated
fn arbitrary_storage_type( | ||
&mut self, | ||
u: &mut Unstructured, | ||
must_share: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing we've done in the past historically is to avoid threading through parameters like this and storing them on Self
. Would that perhaps make sense for this must_share
field? That'd help many callers I think since ideally the shared-ness would, in a perfect world, integrate in a few locations but wouldn't permeate throughout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we can: if we set it on Self
it would be a module-level decision but we actually need a type-by-type decision. It boils down to: "we just generated a shared
object and we need to ensure that any types inside it are also shared
" — thus, must_share = true
. The unshared case is simpler but we also want to generate it: "we just generated an unshared object and it doesn't matter if the types inside are shared or not" — must_share = false
, which allows us to choose later.
There is another way to do this: we could "bubble up" the shared-ness from down below instead of pushing it down from above. I asked @fitzgen what he preferred and he picked the "push down" approach... so here we are.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And to clarify: if it's a module-level flag we would have to set it before calling down and then unset it on the way back up. I figured threading it is more clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes sorry I meant for this to be a you-set-it-then-unset-it kind of field where it's not global-per-module, just stored on the side to avoid threading everywhere. This pattern is used for fields like max_type_limit
I believe so I think this'll fit reasonably well into that pattern.
I agree that passing arguments is more explicit, but it's a fair bit of noise much of the time since it's just being forwarded around. The integration with arbitrary_shared
I think might also make the most sense by putting it into Module
itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Take a look at f8b9edb. After that change I ran it for more than an hour without failure racking up ~14m iterations. (Hope we're still emitting shared stuff!).
As @alexcrichton suggested, this holds a `must_share` flag on `Module` and carefully sets it and unsets it using `propagate_shared`.
This adds initial support for fuzzing the WebAssembly side of the shared-everything-threads proposal. Eventual addition of the CM builtins will require extensions to this.