Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert preopen initialization to be lazy. #408

Merged
merged 2 commits into from
May 3, 2023

Conversation

sunfishcode
Copy link
Member

Insteead of eagerly initializing the preopens in a static constructor, perform preopen initialization the first time it's needed, or before a close or a renumber which might disrupt the file descriptor space.

And, use a weak symbol with a stub function for use by close or fd_renumber, we that they can trigger preopen initialization only if it's actually needed.

This way, if a program doesn't contain any calls to any function that needs preopens, it can avoid linking in the preopen initialization code. And if it contains calls but doesn't execute them at runtime, it can avoid executing the preopen intiailization code.

A downside here is that this may cause problems for users that call __wasi_fd_close or __wasi_fd_renumber directly and close over overwrite some preopens before libc has a chance to scan them. To partially address this, this PR does add a declaration for __wasilibc_populate_preopens to <wasi/libc.h> so that users can call it manually if they need to.

Insteead of eagerly initializing the preopens in a static constructor,
perform preopen initialization the first time it's needed, or before a
close or a renumber which might disrupt the file descriptor space.

And, use a weak symbol with a stub function for use by `close` or
`fd_renumber`, we that they can trigger preopen initialization only
if it's actually needed.

This way, if a program doesn't contain any calls to any function that
needs preopens, it can avoid linking in the preopen initialization code.
And if it contains calls but doesn't execute them at runtime, it can
avoid executing the preopen intiailization code.

A downside here is that this may cause problems for users that call
`__wasi_fd_close` or `__wasi_fd_renumber` directly and close over
overwrite some preopens before libc has a chance to scan them. To
partially address this, this PR does add a declaration for
`__wasilibc_populate_preopens` to <wasi/libc.h> so that users can call
it manually if they need to.
@syrusakbary
Copy link

I think this defeats the propose of how preopens were designed in cloudabi? (for which WASI was designed upon)

Basically, if this gets merged, a malicious program can be designed to inspect what are the preopened dirs (and attack them with other vectors) using timing strategies.
Did I misunderstand that or can we assure there will no side-effects regarding security if the PR gets merged?

@sbc100
Copy link
Member

sbc100 commented Apr 14, 2023

I think this defeats the propose of how preopens were designed in cloudabi? (for which WASI was designed upon)

Basically, if this gets merged, a malicious program can be designed to inspect what are the preopened dirs (and attack them with other vectors) using timing strategies. Did I misunderstand that or can we assure there will no side-effects regarding security if the PR gets merged?

I don't think I understand. Can you explain a bit more? It seems like its up to the wasm module if and when it performs the pre-opens.. why would delaying (or even simply never performing) the pre-open syscalls make any difference? In the threat model you are referring to where is the malicious code running? And what is the vulnerable code?

return;
}

LOCK(lock);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this work with _REENTRANT?
this function calls internal_register_preopened_fd which tries to acquire the same lock.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, good catch. I've now fixed that.

@syrusakbary
Copy link

Can you explain a bit more? It seems like its up to the wasm module if and when it performs the pre-opens.. why would delaying (or even simply never performing) the pre-open syscalls make any difference?

Basically, the main reason is timing attacks. That's why in secure places rather than comparing strings you usually compare hashes of them (because they are constant in time no matter the length of the string).
If you don't precompute the preopens it can become obvious to an attacker which are the preopened directories. That by itself is not an issue per se, but it can become one very quickly

@sbc100
Copy link
Member

sbc100 commented Apr 16, 2023

Can you explain a bit more? It seems like its up to the wasm module if and when it performs the pre-opens.. why would delaying (or even simply never performing) the pre-open syscalls make any difference?

Basically, the main reason is timing attacks. That's why in secure places rather than comparing strings you usually compare hashes of them (because they are constant in time no matter the length of the string). If you don't precompute the preopens it can become obvious to an attacker which are the preopened directories. That by itself is not an issue per se, but it can become one very quickly

IIUC the pre-opening actually happens in on the host side, via command line flags such as --dir= in wasmtime. Whether or when the wasm module decides to enumerate, or access those preopened files is completely up the userspace program, isn't it? I don't think we can or should try to mandate if or when a program chooses to access them.

Some programs might choose never to access the re-opened files or directories at all. IIUC wasi-libc already tries to completely avoid pre-opening for programs that don't use file access at all. Why include that that code at in programs that don't actually use files, right?

Furthermore, I don't think that the design of the pre-open system had anything to do with timing attacks, its all about providing a way to pass the initial file handles to a program in system that doesn't support open directly (only openat, which requires at least an initial directory fd). I could be wrong though, do you have any evidence to suggest that this was part of the design?

@yamt
Copy link
Contributor

yamt commented Apr 17, 2023

I think this defeats the propose of how preopens were designed in cloudabi? (for which WASI was designed upon)

Basically, if this gets merged, a malicious program can be designed to inspect what are the preopened dirs (and attack them with other vectors) using timing strategies. Did I misunderstand that or can we assure there will no side-effects regarding security if the PR gets merged?

what exactly does such a malicious program want to measure the timing of?
if it's wasi api like fd_prestat_dir_name, such a malicious program can happily call it by itself to measure, regardless of this PR, can't it?

Factor out the lock acquisition from the implementation of
`internal_register_preopened_fd` so that we can call it from
`__wasilibc_populate_preopens` with the lock held.
@sunfishcode
Copy link
Member Author

Indeed, this PR isn't doing anything that malicious code can't already do. Are there any further comments here?

@yamt
Copy link
Contributor

yamt commented Apr 28, 2023

lgtm except that i don't understand syrusakbary's security concern.

@sunfishcode
Copy link
Member Author

If there are any further concerns, please file an issue.

@sunfishcode sunfishcode merged commit a6f8713 into main May 3, 2023
@sunfishcode sunfishcode deleted the sunfishcode/lazy-preopens branch May 3, 2023 19:44
@achille-roussel
Copy link

Hello @sunfishcode !

I've been working on integrating WASI in Go recently, and I am wondering whether we should replicate this behavior in Go. My thinking is that the more guests behave similarly across languages, the more likely they are to avoid subtle bugs when interacting with the host.

Would you be able to share more details about the reasons that motivated this change?

@sunfishcode
Copy link
Member Author

Not all programs need preopens, so this change allows programs that don't use them at all to avoid linking them in, and programs that don't use them dynamically to avoid calling the preopen initialization functions.

It also potentially helps us with a broader question about how C++-style static constructors should work in Wasm, because previously the preopens were initialized with such a constructor, making it one of the more common reasons why many programs had a constructor in them. With this change, many programs can avoid having any constructors. Some still will, but if it's a small number, we may have more options.

@achille-roussel
Copy link

achille-roussel commented May 11, 2023

Thanks for sharing the context!

It sounds like for the case of Go, it would be really rare to have programs that don't need to deal with preopens, but maybe will be useful for TinyGo!

abrown added a commit to abrown/wasi-libc that referenced this pull request Apr 26, 2024
Issue [#8392] in Wasmtime highlights how preopen registration can result
in a hang when compiled for a threaded environment. The problem is that
`internal_register_preopened_fd_unlocked` assumes it will not touch the
global lock because its caller, `__wasilibc_populate_preopens`, already
has taken the lock. Unfortunately, a refactoring in WebAssembly#408 (which
introduces `internal_register_preopened_fd_unlocked`) did not catch that
the `resize` function called internally also takes the global lock. This
change removes that locking in `resize` under the assumption that it
will only be called when the lock is already taken.

[#8392]: bytecodealliance/wasmtime#8392
abrown added a commit that referenced this pull request May 2, 2024
Issue [#8392] in Wasmtime highlights how preopen registration can result
in a hang when compiled for a threaded environment. The problem is that
`internal_register_preopened_fd_unlocked` assumes it will not touch the
global lock because its caller, `__wasilibc_populate_preopens`, already
has taken the lock. Unfortunately, a refactoring in #408 (which
introduces `internal_register_preopened_fd_unlocked`) did not catch that
the `resize` function called internally also takes the global lock. This
change removes that locking in `resize` under the assumption that it
will only be called when the lock is already taken.

[#8392]: bytecodealliance/wasmtime#8392
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants