-
Notifications
You must be signed in to change notification settings - Fork 13.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Should std::env::set_current_dir be unsafe? #37984
Comments
Any API that loads libraries at runtime and pulls symbols from them is inherently unsafe for a multitude of reasons unrelated to set_current_directory. |
@sfackler, loading from the current directory is safe if you know what this directory is because you just called The reference to library loading was an attempt at a formal argument towards lack of memory safety for |
You are always going to be asserting that some symbol is a function with a certain signature or a static of a certain type. In any case, the convention is to place the unsafety at the closest layer to where the actual badness could result. It is safe to get a raw pointer from a slice, but not safe to dereference it, for example. It is also unreasonable to make a function unsafe after it is already stable (and has been for over a year). |
@fweimer If you want to load a specific library, use an absolute path! I can't think of any time when you absolutely must use a relative path to the current directory. Get the current directory once when your process first starts up and then use absolute paths from then on. |
@sfackler Is it really impossible already to correct past mistakes? Would it make sense to add a warning to the documentation? Or do you disagree that the issue exists at all? |
The only way I could see |
@retep998 There is no common object to synchronize on, though. If a function receives a relative pathname (or gets it from the command line or the environment), it has to perform pathname resolution based on the current directory. If there is anything else in the process which temporarily alters the current directory for any purpose, this is not reliable and can give incorrect results. And because there is no obvious lock to take around the current directory change, it will always be risky to call this function. The conclusions I and others (Rich Felker in particular, with his library-safe functions concepts) is that library code cannot call |
@fweimer That's not memory unsafe. Rust has very specific rules about what is considered unsafe, and setting the current directory is not unsafe by those rules. It's a simple atomic operation to either get or set the current directory. As long as the safety of unsafe code never relies on the incorrect assumption that the current directory will never change, then everything is perfectly fine. If unsafe code ever does make that assumption, then it is that code's fault for doing so As I said before, it is relatively simple to avoid depending on the current directory. Anything that can take a relative path can also take an absolute path, so just stick to absolute paths everywhere. You can get the current directory once when your program first starts up, cache the result, and then every time you get a relative path as input you can just make it absolute using that current directory you cached. That way if some other code changes the current directory you won't be affected. |
@retep998 I was following this precedent: static mut var: u32 = 0;
fn main() {
var = 5;
} This does not compile:
This cannot cause memory safety violations on its own, either. |
It can (at least, the general feature of mutating statics) -- accesses from multiple threads cause data races, which cause UB, which can in turn cause anything including memory safety violations. |
Meanwhile |
A similar problem exists with |
@vadimcn, regarding I must say I find it rather strange that Rust considers writing to a |
@fweimer, concurrent modification of an int is still considered a UB by C/C++ and LLVM. |
There's several ways to concurrently modify an integer. Unsynchronized which is totally undefined behavior, behind a mutex which is fine, and using atomics which is also fine. For the current directory it's just a matter of figuring out which method it uses to do the modification. Since on Windows (and probably non-windows) it works with the current directory behind a mutex, it is therefore safe. If it did the modification without any synchronization, then that would be a huge issue, but it doesn't, so everything is fine. |
The WinAPI docs say it's unsafe to call from many threads. Is it actually locked as you describe?
|
@tbu- The current directory is stored in the Process Environment Block (PEB) and every time the current directory is get or set (or any part of the PEB is accessed), it acquires the |
@retep998 Which Windows versions did you test on? I'm not a fan of relying on undocumented behavior for memory safety. Microsoft could have trivially said that these functions were thread safe, but they explicitly documented that they were not. Anything can happen if |
@Zoxc |
This does sound like more than a race condition. I don't expect Microsoft to remove that lock though. Ideally we'd get them to document it. |
There are no data races in the implementation on Windows, and neither on the other platforms. Rust supports global mutable variables via atomics. I think this issue can be closed. |
We discussed this in the recent Libs meeting and the answer to the question that preempts this one: could It's been a few years since we've had activity here so I'll go ahead and close this one, but if any new information comes up we can open new, targeted issues for them. |
@tbu- wrote in 2017:
As of today, the current version of the SetCurrentDirectory documentation at https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-setcurrentdirectory?redirectedfrom=MSDN says:
The references to corruption have been removed. This is consistent with the locking behavior observed in comments above. My guess is that the locking wasn't originally there, and was added later. So there might be some version of Windows that doesn't have the locking. However, it might be the case that those versions are so old that Rust doesn't support them. Importantly, what matters is not whether there is locking on a particular version of Windows (whichever one we did the debugging on) but that the locking is present in all versions of Windows supported by Rust. (One might infer from the change to the Microsoft docs that all the versions of Windows that are supported by Microsoft do have the locking.) KodrAus wrote in 2020:
Currently we think I'm commenting on this here because |
I understand you're making a broader point on libs policy but I wanted to address this specific issue. There has been no change in behaviour here for either supported or unsupported versions of Windows. As the commit that amended the docs says:
The old documentation was overzealous in warning about the dangers of implicitly relying on the current directory to be unchanging in a multithreaded program. It was not meant to imply memory safety issues, hence the change in wording. In short, this was a documentation bug that has been fixed. |
Understood. One thing that isn't clear though, is how far back the MS team went to verify the change. I.e. is it their policy to consider versions that are no longer supported to have never existed (especially since, in practice, it is hard for any other policy to be a good use of time, as that's the whole purpose of dropping support)? I do feel some comfort in seeing that that commit was apparently authored by @oldnewthing. |
@briansmith The lock has been there since Win95 and NT 3.51. (My archives don't go back to NT 3.1 but I doubt the story in 3.1 is any different.) You are correct that when an old version of Windows is dropped, we stop updating the documentation for that obsolete version. For example, InterlockedIncrement originally returned only the sign of the result, but since NT 4.0, it has returns the incremented value. We stopped documenting the old behavior since all applicable versions of Windows are out of support. |
The current directory is a per-process resource (at least on POSIX). Changing it could affect what shared objects are loaded by a library, and therefore, there might be an indirect impact on memory safety. Should it be marked
unsafe
?The text was updated successfully, but these errors were encountered: