Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Windows file system shims are missing #3482

Open
RalfJung opened this issue Apr 18, 2024 · 22 comments
Open

Windows file system shims are missing #3482

RalfJung opened this issue Apr 18, 2024 · 22 comments
Labels
A-files Area: related to files, paths, sockets, file descriptors, or handles A-shims Area: This affects the external function shims A-windows Area: affects only Windows targets C-project Category: a larger project is being tracked here, usually with checkmarks for individual steps

Comments

@RalfJung
Copy link
Member

RalfJung commented Apr 18, 2024

Miri supports accessing the file system (opening, reading, writing files; listing directories) on Unix systems, but not on Windows. This is probably the biggest remaining gap in our Windows support.

If you run into this, try cargo miri test --target x86_64-unknown-linux-gnu. This works even on a Windows host (without installing anything extra) and interprets programs on a well-supported Linux target -- file system access should generally work then. If it doesn't please file an issue (do not comment on this issue, instead file a new one).

Nobody on the Miri team is very knowledgeable in Windows APIs, and a reasonable work-around exists for Windows users, so this is unlikely to be on our agenda any time soon. Contributions are always welcome though. :)

@RalfJung RalfJung added A-shims Area: This affects the external function shims A-windows Area: affects only Windows targets labels Apr 18, 2024
@RalfJung RalfJung added the C-project Category: a larger project is being tracked here, usually with checkmarks for individual steps label Apr 18, 2024
@mbyt

This comment was marked as off-topic.

@RalfJung

This comment was marked as off-topic.

@mbyt
Copy link

mbyt commented Apr 19, 2024

Thanks again for your fast response.

What I currently do, is that I run the complete unittest suite on windows and skip the tests which (i) access the filesystem or (ii) a foreign function. This works actually quite well and thanks to miri I already found a few places of undefined behavior. That's marvelous!

With this ticket I could also run the tests which access the filesystem and would only need to skip the tests with the foreign function.

Regarding external C dependencies: Is this the topic of #2365?

@RalfJung
Copy link
Member Author

RalfJung commented Apr 19, 2024

Okay, so the problem is getting the build scripts to shut up and do nothing so that you can use --target x86_64-unknown-linux-gnu. On Windows the external dependencies will be built and linked even in Miri, which is useless but harmless. With a cross-target then it is not harmless any more, it causes the build to fail. I can see how disabling those build scripts could become tricky. Might still be simpler than implementing the Windows FS shims though. ;)

Regarding external C dependencies: Is this the topic of #2365?

Yes.

This works actually quite well and thanks to miri I already found a few places of undefined behavior. That's marvelous!

Awesome. :)

If these bugs have a public track record / issue / PR somewhere, we always appreciate PRs that add more bugs to the "bugs found with Miri" list in the readme. :D

@mbyt
Copy link

mbyt commented Apr 22, 2024

Thanks again for your fast response and help. This is really a great project!

If these bugs have a public track record / issue / PR somewhere, we always appreciate PRs that add more bugs to the "bugs found with Miri" list in the readme. :D

Unfortunately the issues are in the closed source components, thus I cannot share. However, if interesting for you here the error messages of the discovered errors:

  • error: Undefined Behavior: accessing memory based on pointer with alignment 1, but alignment 4 is required
  • error: memory leaked: alloc288778 (C heap, size: 824, align: 16), allocated here:
  • error: casting references to a bigger memory layout than the backing allocation is undefined behavior, even if the reference is unused

@RalfJung RalfJung added the A-files Area: related to files, paths, sockets, file descriptors, or handles label May 5, 2024
@cgettys-microsoft
Copy link
Contributor

I'm also interested in this and might be willing to take a crack at adding support. @RalfJung, maybe you could clarify what the main challenge to doing this? I'm not asking you guys to do it, just seeking to understand what I'd be getting myself into.

Is it cross-emulation? E.g. emulating the behavior of Windows FS apis on Linux?
Is it having an in-depth enough knowledge of the Windows APIs involved to be able to well specify what would be UB when using them or when operating on their results (e.g. what return codes tell you that a buffer you passed into the function is now initialized memory, or how much of the buffer is initialized)?
Is it sheer volume of the number of APIs that need to be shimmed?

Naively, it seems like -Zmiri-disable-isolation support wouldn't be all that tricky for someone with a decent degree of Windows API knowledge and enough time on their hands - many, many shims to add to src\shims\windows\foreign_items.rs?
But I'm pretty sure I'm missing something, even reading through the source and trying to understand what the existing unix fs handling is doing.

@saethlin
Copy link
Member

Naively, it seems like -Zmiri-disable-isolation support wouldn't be all that tricky for someone with a decent degree of Windows API knowledge and enough time on their hands - many, many shims to add to src\shims\windows\foreign_items.rs?

Both Windows API knowledge and free time is in very short supply around here.

@RalfJung
Copy link
Member Author

RalfJung commented Jun 21, 2024 via email

@cgettys-microsoft
Copy link
Contributor

cgettys-microsoft commented Jun 21, 2024

I wouldn't personally call myself a Windows API expert, but I definitely know folks who have expertise. But no promises as is often the case RE time :(.

Even the fraction that std uses is pretty comprehensive. Ex: WriteFile vs NtWriteFile, and ReadFileEx vs NtReadFile - looks like pipes for std::process use one and std::fs used the other. And so on.

It should be possible to do in terms of Rust std:: apis, I think anyway.

Here's some Microsoft docs that are relevant and publicly available (in case anyone finds them interesting / in case I don't have a chance to attack this).
https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/asplos2011-drawbridge.pdf
https://www.microsoft.com/en-us/sql-server/blog/2016/12/16/sql-server-on-linux-how-introduction/
Clearly the WSL guys have managed bi-directional interop, not just one direction. Driver called DrvFs handles at least one direction?
https://learn.microsoft.com/en-us/windows/wsl/wsl-config
But none of these go into too much detail / at the API call level, nor are any of these OSS as far as I know.

@saethlin
Copy link
Member

The std::process APIs are not very interesting for Miri yet. Not only is spawning other processes unsupported on Linux because we would need many more shims, it's not even clear how that would be implemented. Would we spawn a new interpreter? IPC would get weird.

@cgettys-microsoft
Copy link
Contributor

Ah, yeah, good point. Nor is std::process something I personally need.

Yeah, if it's mostly sync I/O without the fancy stuff (APC or IOCP or similar), maybe it's not too bad. Still a significant number, but tractable, with this presumably being most of the list.
https://github.com/rust-lang/rust/blob/c1b336cb6b491b3be02cd821774f03af4992f413/library/std/src/sys/pal/windows/fs.rs#L1064.

@ChrisDenton
Copy link
Member

We do have some odd bits in the standard library. For example, we need to guard synchronous read/write functions against asynchronous file handles, which are safe to create. And maybe don't even look at remove_dir_all until last, ha.

But if you do have any questions about the standard library then feel free to ping me.

@beepster4096
Copy link
Contributor

beepster4096 commented Jun 22, 2024

And maybe don't even look at remove_dir_all until last, ha.

Maybe don't ever look at it. On Unix, std uses the fallback impl of it when running in miri.

@RalfJung
Copy link
Member Author

On Unix, std uses the fallback impl of it when running in miri.

I hope we'll get rust-lang/rust#120426 at some point and then we can implement more POSIX APIs in Miri on top of that and start using the real impl in Miri... but anyway, that's off-topic here. :)

@CraftSpider
Copy link
Contributor

I'm working on this in my free time the past couple of days, and have... 6/13 functions of the shims/fs.rs test working so far. Will open a PR once everything is at least basically implemented.

@RalfJung
Copy link
Member Author

Ideally make the first PR as small as possible, that will make everything easier. :) So if there's a subset of functions that can make a very simple test case work, please go for just that subset in the first PR.

@CraftSpider
Copy link
Contributor

Ah, if that's preferred, I can try to split off just the bare-minimum open/meta/read/write/close support when I have some time. That lets us do the most simple and common of file operations.

@RalfJung
Copy link
Member Author

Yes that sounds like the better plan -- thanks. :)

@cgettys-microsoft
Copy link
Contributor

cgettys-microsoft commented Nov 21, 2024

Glad to hear that someone is working on this - apologies that I didn't get to it.

If it'd be helpful to have folks with Windows Filesystem API knowledge help review the PRs to add Windows shims, happy to help find subject matters experts from Microsoft to help.

CC'ing Arlie Davis (@sivadeilra) as well - as he's probably forgotten more about Windows APIs than I've ever known 😄.

@sivadeilra
Copy link

I'm happy to help, to the degree that I can. Are the PRs just these tree? (#4046, #4045, #4043)

@CraftSpider
Copy link
Contributor

Currently, those three are the only ones up. I have more implemented locally, but I'm focusing on getting minimal support upstreamed before I continue with more in-depth support. I can try to remember to tag you on future PRs I open for this project if you'd like.

@CraftSpider
Copy link
Contributor

First actual API additions now open: #4067
After that merges, #4043 will be only a couple new shims and some stdin/stdout handling cleanup. After that, it's just a matter of fleshing out the API surface, so more of the shims/fs.rs methods work on Windows. I already have a number of them working locally, I'll open PRs after the basics are in.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-files Area: related to files, paths, sockets, file descriptors, or handles A-shims Area: This affects the external function shims A-windows Area: affects only Windows targets C-project Category: a larger project is being tracked here, usually with checkmarks for individual steps
Projects
None yet
Development

No branches or pull requests

8 participants