Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support no_std in the wasmparser crate #1493

Merged
merged 2 commits into from
Apr 25, 2024

Conversation

alexcrichton
Copy link
Member

@alexcrichton alexcrichton commented Apr 11, 2024

This commit adds a std feature to the wasmparser crate, enables it
by default, and then enables building the crate when the feature is
disabled. This works by switching imports of wasmparser from std to
either the core or alloc crates which are available in the no_std
context. The main change made in this PR is that maps used by
wasmparser, e.g. HashMap and IndexMap, are now abstracted with
type aliases in wasmparser which unconditionally customize a hash
algorithm parameter. When the std feature is enabled this parameter is
implemented using the standard library's default hashing algorithm. When
std is disabled then this uses the ahash crate internally.

This commit additionally updates CI to both check that wasmparser builds
when the std feature is disabled and that it builds on a target
without std.

@Robbepop
Copy link
Collaborator

Robbepop commented Apr 12, 2024

I really like this direction!

One thing that I did in the wasmparser-nostd fork was to replace all HashMap usage with BTreeMap where possible. The reason for this is that HashMap is potentially attackable in a real no_std environment where there is no possibility to randomly initialize the HashMap's state if the users can control the inputs which is very much possible in a Wasm parser unfortunately. This is also the reason why HashMap is not part of Rust's alloc library.
On the flip side, I did not see any changes in any of the wasmparser benchmarks by doing so. Generally I think Rust's BTreeMap is highly underrated.

The biggest downside is that the indexmap dependency (which is used a lot for wasmparser's component model implementation) uses hashbrown::HashMap for it's no_std support which required me to also fork it with indexmap_nostd. However, that was very burdensome and I'd prefer not to.

@alexcrichton
Copy link
Member Author

I filled out a bit more information about hashers at bytecodealliance/wasmtime#8341 (comment) inspired by your comment, thanks for bringing it up!

If possible I'd prefer to keep hash maps as foundational data structures and avoid switching data structures just based on no-std. I think that the utilities provided by ahash and various other possible tricks we can do should be able to largely retain the DoS guarantees.

@alexcrichton alexcrichton marked this pull request as ready for review April 17, 2024 15:24
@alexcrichton
Copy link
Member Author

I've updated this to be a landable version for bytecodealliance/wasmtime#8341. This now includes CI checks and a bit more documentation.

This commit adds a `std` feature to the `wasmparser` crate, enables it
by default, and then enables building the crate when the feature is
disabled. This works by switching imports of `wasmparser` from `std` to
either the `core` or `alloc` crates which are available in the `no_std`
context. The main change made in this PR is that maps used by
`wasmparser`, e.g. `HashMap` and `IndexMap`, are now abstracted with
type aliases in `wasmparser` which unconditionally customize a hash
algorithm parameter. When the `std` feature is enabled this parameter is
implemented using the standard library's default hashing algorithm. When
`std` is disabled then this uses the `ahash` crate internally.

This commit additionally updates CI to both check that wasmparser builds
when the `std` feature is disabled and that it builds on a target
without `std`.
Comment on lines +112 to +126
// When the `std` feature is NOT active then rely on `ahash::RandomState`. That
// relies on ASLR by default for randomness.
#[derive(Default, Clone, Debug)]
#[cfg(not(feature = "std"))]
struct RandomStateImpl;

#[cfg(not(feature = "std"))]
impl BuildHasher for RandomStateImpl {
type Hasher = ahash::AHasher;

#[inline]
fn build_hasher(&self) -> ahash::AHasher {
ahash::RandomState::new().build_hasher()
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense expose a method for manually setting the random state for nostd users that happen to have some source of randomness we don't (and can't) know about?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is sort of already handled by the ahash crate itself although that's not quite the same as providing it through wasmparser itself. My thinking is given that this is going to be shared across Wasmtime and wasmparser it might be best to leave this to ahash for now and add it here if necessary in the future

@alexcrichton alexcrichton added this pull request to the merge queue Apr 25, 2024
Merged via the queue into bytecodealliance:main with commit d1eeff9 Apr 25, 2024
18 checks passed
@alexcrichton alexcrichton deleted the nostd branch April 25, 2024 18:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants