Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement fs::realpath #11734

Closed
wants to merge 1 commit into from
Closed

Conversation

alexcrichton
Copy link
Member

This can be used to strip all symlinks from a pathname. This is then used to
close #3632.

/// If this function encounters errors, it will raise the error on the
/// `io_error` condition and the original path argument is returned.
pub fn realpath(original: &Path) -> Path {
static MAX: uint = 256;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be a constant defined in libstd/num or something?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose this is indeed a terrible name for this constant. I meant for this to be the maximum number of symlinks which will get resolved. I'll update this.

}
};
let original = os::make_absolute(original);
let mut result = Path::new("/");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not going to behave properly for Windows paths. I don't know if Windows even has symlinks, but if so, the root of the filesystem is not going to be /. You can use original.root_path() to get the root of the path.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's also worth noting that os::make_absolute may not actually return a fully absolute path on Windows. If the given path is a volume-relative path, e.g. C:foo, and the cwd is not from that volume, the resulting path will be the volume-relative path rather than an absolute path. There's been some discussion in #11579 as to whether volume-relative paths really are valid, but for the moment they are.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little confused. It looks like root_path returns option, so there's some failure modes to consider. On unix, I presume it'll never return None because make_absolute will work, but I don't understand the windows case at all.

If you have a volume-relative path, what's it relative to, if not the current directory? It could reference any file on the filesystem, but we opened the path from the OS and read the file so there's some path that it points to somewhere, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexcrichton The output of os::make_absolute().root_path() should always be a Some value. I say "should" because this assumes that getcwd() returns an absolute path (which it damn well should). The result of absolute_path.join(any_path) is a path that will return a Some value from .root_path() (and that's how os::make_absolute() works). Given that, it's perfectly fine to say let mut result = original.root_path().unwrap().

Volume-relative paths are not going to be usable with this function. Technically, they're relative to the current working directory for the specified volume. Classically, Windows maintained a separate working directory per-volume, and this has to do with how they supported old DOS programs that didn't have the notion of directories. There's a bunch of history in #11579 about this if you're curious.

Assuming we keep volume-relative paths (and I'm leaning towards doing so, though I still haven't thought through everything), I think the appropraite behavior here is something like

let original = os::make_absolute(original);
if !original.is_absolute() { return original.clone(); } // for Windows C:foo paths
let mut result = original.root_path().unwrap();
// ...

This way a volume-relative path will be returned unchanged.

@lilyball
Copy link
Contributor

r=me if the noted issues can be fixed.

@alexcrichton
Copy link
Member Author

r? @kballard

This seems tricky enough I'd like another pair of eyes before auto r+'ing

This can be used to strip all symlinks from a pathname.
@bill-myers
Copy link
Contributor

As far as I can tell, this is broken, because if a symlink points to a path that contains a symlink as an intermediate component, that will remain in the final path.

That is, realpath("/a") where /a is a symlink to "/b/c", /b is a symlink to "d": realpath will return "/b/c", but it should return "/d/c".

Also, you can't just come up with an invented magic number for the link limit: it must be exactly the same as the one used by the operating system, which you get by sysconf(_SC_SYMLOOP_MAX) on POSIX, and I'm not sure how on Windows.

Is implementing this by hand in Rust the best approach, as opposed to calling uv_realpath in librustuv and in libnative realpath on POSIX and something like CreateFile+GetFinalPathNameByHandle on Windows?

If so, I'd suggest reading the source code of realpath in glibc, copying what it does, and then testing that it works on Windows with both junctions and symbolic links (and if not, doing whatever is necessary there).

@alexcrichton
Copy link
Member Author

Looks like I just learned that realpath exists in C.

No need to reimplement in rust due to how tricky it is.

@lilyball
Copy link
Contributor

@bill-myers That is a very good collection of points.

I didn't even realize the symloop limit was configurable. Curiously, my manpage for sysconf() does not list _SC_SYMLOOP_MAX (this is OS X), although unistd.h does inded have it.

If libuv provides realpath functionality, using that seems like a good idea for libgreen. For libnative in POSIX, I agree that realpath() is a good solution. For Windows, I would suggest finding existing code that solves this problem (maybe even see what uv_realpath() does for Windows), because creating a file seems like a bad idea (or even opening one; resolving a path should work even if one does not have permission to read the file at the end of that path, as long as the ancestor directories are all readable).

@alexcrichton
Copy link
Member Author

Actually, it appears that the manpages for realpath don't recommend using it due to not knowing how large the output path can be. Additionally, libuv does not provide uv_fs_readlink. Finally, from a quick glance around, it looks like we're going to need to roll our own for windows anyway.

@alexcrichton alexcrichton reopened this Jan 27, 2014
@lilyball
Copy link
Contributor

FWIW, on OS X, the realpath source (note: apparently covered by the 3-clause BSD license, if you have any worries about reading the source) uses PATH_MAX. It's also quite complicated; it does some stuff with mountpoints, and has comments referencing chroot, so there's obviously some subtlety there we haven't even considered.

It also uses MAXSYMLINKS as the number of links it's willing to resolve, and that apparently has the low value of 32.

Given all this, it seems to me that we should use realpath() on POSIX and find some other existing solution for Windows instead of trying to write it ourselves. I'm also not sure what you mean by "libuv does not provide uv_fs_readlink". It certainly seems to exist in the libuv tree.

@bill-myers
Copy link
Contributor

The manpage also says "The resolved_path == NULL feature, not standardized in POSIX.1-2001, but standardized in POSIX.1-2008, allows this design problem to be avoided".

The function is usable as long as either PATH_MAX is defined, or you specify NULL for resolved_path, which results in libc allocating the path with malloc().

Not too sure on Windows, but I'd suggest to try to avoid writing your own at all costs.

Windows has short vs long file names, UNC paths, NT vs Win32 paths, two kinds of symbolic links (junctions and "symbolic links") plus pluggable reparse points with arbitrary behavior, and at least one among junctions and symbolic links, when it is on a network path, will refer to the path on the remote machine even if it is absolute.

@alexcrichton
Copy link
Member Author

I'll defer to your judgements, sounds like you guys know more about this area than I do.

@lilyball
Copy link
Contributor

Just filed an issue to track the fact that we need to find a solution for this.

@alexcrichton
Copy link
Member Author

Thanks @kballard

@alexcrichton alexcrichton deleted the issue-3632 branch January 27, 2014 23:18
alexcrichton added a commit to alexcrichton/rust that referenced this pull request Apr 8, 2014
When calculating the sysroot, it's more accurate to use realpath() rather than
just one readlink() to account for any intermediate symlinks that the rustc
binary resolves itself to.

For rpath, realpath() is necessary because the rpath must dictate a relative
rpath from the destination back to the originally linked library, which works
more robustly if there are no symlinks involved.

Concretely, any binary generated on OSX into $TMPDIR requires an absolute rpath
because the temporary directory is behind a symlink with one layer of
indirection. This symlink causes all relative rpaths to fail to resolve.

cc rust-lang#11734
cc rust-lang#11857
alexcrichton added a commit to alexcrichton/rust that referenced this pull request Apr 10, 2014
When calculating the sysroot, it's more accurate to use realpath() rather than
just one readlink() to account for any intermediate symlinks that the rustc
binary resolves itself to.

For rpath, realpath() is necessary because the rpath must dictate a relative
rpath from the destination back to the originally linked library, which works
more robustly if there are no symlinks involved.

Concretely, any binary generated on OSX into $TMPDIR requires an absolute rpath
because the temporary directory is behind a symlink with one layer of
indirection. This symlink causes all relative rpaths to fail to resolve.

cc rust-lang#11734
cc rust-lang#11857
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Symlinked rustc can't find sysroot
4 participants