Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Append workspace paths using components #8874

Closed
wants to merge 2 commits into from
Closed

Append workspace paths using components #8874

wants to merge 2 commits into from

Conversation

DJMcNab
Copy link

@DJMcNab DJMcNab commented Nov 20, 2020

This improves the handling when cargo is run on windows using
a UNC path as its working directory.

This error was met working on EmbarkStudios/rust-gpu#239.

An alternative implementation which would also be sufficient for rust-gpu would be to only append the new path's components, and keep the same format for root_dir.

This could also be limited to the expanded_list.push(pathbuf); branch, if preferred, although this would prevent rust-gpu from using glob paths at all, which would be unfortunate.

See also:

This improves the handling when cargo is run on windows using
a UNC path as its working directory.

See also:
 - #7986
 - rust-lang/rust#42869
@rust-highfive
Copy link

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @Eh2406 (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Nov 20, 2020
@Eh2406
Copy link
Contributor

Eh2406 commented Nov 20, 2020

For past discussion see #8626 and #6198 and dtolnay/request-for-implementation#34. I think fixing all the things will be hard. But,

This looks like a small reasonable change. If you can add a test that demonstrates where this helps, we my be up for merging this incremental improvement.

@alexcrichton
Copy link
Member

I'm a bit confused as to why .join isn't working here because I feel like we use that a lot in other parts of Cargo. If join is doing something surprising then the bug might be in join itself and not something we should work around in just one location in Cargo?

@DJMcNab
Copy link
Author

DJMcNab commented Nov 20, 2020

So join is implemented using PathBuf::push, which itself is implemented using OsString::push

This therefore does not change seperators, and since UNC paths (intentionally) do not handle /, this is skipped.

The below is mostly speculative, but I think .join in its current form is WAI.

There is a world where Path::joining a/b to the end of a UNC path makes sense, such as to interact with an insane filesystem which allows / to be used in a path; in general it's impossible to know which join was intended. However here, the cargo workspace format is defined to use /s as folder seperators only, and filesystems where / is not a folder seperator and instead part of a folder name do not have to be supported.

I do agree an upstream convenience method for this (more common imo) case would be nice though, but this isn't a bug in std. It's worth noting that most of the other uses of join in cargo are valid, because they only ever use single segment paths (e.g. .join(Cargo.toml)) or are part of tests where running them with a UNC working directory is unpermitted.

@dylni
Copy link

dylni commented Nov 20, 2020

@alexcrichton

I'm a bit confused as to why .join isn't working here because I feel like we use that a lot in other parts of Cargo. If join is doing something surprising then the bug might be in join itself and not something we should work around in just one location in Cargo?

join doesn't behave in the expected way for some Windows paths, as retep998 explains here. That's why I created normpath, and I plan to integrate it into Cargo to solve these issues. It has its own join method that should always behave correctly.

Even this PR isn't 100% correct, but it should give better results. Once I have time to create a PR to integrate normpath, this issue should be fixed completely.

@dylni
Copy link

dylni commented Nov 20, 2020

Reading through it again, I think this PR would give incorrect results when glob starts with C:\. It wouldn't treat it as an absolute path.

Ignore this.

@alexcrichton
Copy link
Member

This sort of sounds like a bug in std::path to me? If libstd is blindly copying the results of path.join without care for the separator that seems like a bug. I would expect join to be equivalent to the code in this PR, for example.

@DJMcNab
Copy link
Author

DJMcNab commented Nov 23, 2020

The problem is that "add a/b to UNC path c" semantically means "add an a/b component at the end of c", because a/b is a valid component in a UNC path. However specifically in cargo, a / in the workspace members key is always a folder seperator, and we explicitely do not support paths with a / component.

This just aligns the implementation with the intended semantics.

I think the std semantics are correct, because otherwise std::path would be unusable for UNC paths with a / component (including all the places a std::path::Path is taken as an argument). However there should also be a way to specify 'I do not care about / not being a path seperator in UNC' semantics, which there currently isn't and the best method to do so is to use the approach taken in this PR.

@dylni
Copy link

dylni commented Nov 23, 2020

However specifically in cargo, a / in the workspace members key is always a folder seperator, and we explicitely do not support paths with a / component.

More specifically, Cargo assumes that config files use relative paths, which is reasonable. The current directory being a verbatim path shouldn't change how the config file path is interpreted. However, libstd's Path is a general-purpose path, so it has no information about how the joined path should be interpreted.

@dylni
Copy link

dylni commented Nov 23, 2020

I'm not against changing the behavior of Path::join, since I don't have any examples of when / instead of \ makes a difference. However, it would be a silent breaking change, which might be dangerous.

@ehuss
Copy link
Contributor

ehuss commented Nov 25, 2020

Just a random note, another option might be to convert all \\?\ prefixed paths to \\.\. AFAIK, the \\.\ prefix is the same, except the Win32 APIs perform canonicalization. That means relative components . and .. will be handled, as well as converting forward slashes to backslashes (among other normalizations). This would allow Cargo to naively use .join("foo/bar") relative paths and still work correctly.

@DJMcNab
Copy link
Author

DJMcNab commented Nov 25, 2020

That is a very good point

That option is documented here

I'm not sure how to do it using the stdlib APIs, without making it rather janky

@DJMcNab
Copy link
Author

DJMcNab commented Nov 25, 2020

@ehuss I have tried to use that, and unfortunately the url crate doesn't support that (https://docs.rs/url/2.2.0/src/url/lib.rs.html#2551). For some reason cargo needs to create a URL, and that fails. This patch is still required for the planned rust-gpu build system to be usable on windows.

@XAMPPRocky
Copy link
Member

XAMPPRocky commented Dec 1, 2020

It seems like discussion has stalled, but I would also like to voice that I would very much like for this fix to be merged, as there is no real workaround available.

Having this fixed in std::path directly would be great, but as others have mentioned there isn't a simple fix for Path, and there probably needs to be a broader discussion on the correct behaviour of Path::join on Windows.

I don't think that having that discussion should preclude cargo from fixing the couple of instances where it's clearly wrong though. Especially when fixes for Windows APIs in std often take a very long time to be fixed. (There is still no patch for fs::remove_dir_all, and it's been several years since people knew that was broken.)

@Eh2406
Copy link
Contributor

Eh2406 commented Dec 3, 2020

I do not object to this change. If the conversation has not come to a plan I will put it on the agenda at the next cargo meating.

@dylni
Copy link

dylni commented Dec 7, 2020

@Eh2406 Can you also discuss this related PR? It hasn't had much activity.

@joshtriplett
Copy link
Member

joshtriplett commented Dec 8, 2020

I think there are two cases here, and I think Cargo should handle one of them and not the other.

Cargo uses / in manifest paths on all platforms. Cargo should not assume that it's acceptable to do some_path.join("a/b") on every platform and expect / to get translated to a path separator. That doesn't seem like a reasonable assumption for portable code to make, and this issue is an example of why. We're expecting paths in manifests to use / in paths, but that means it's our responsibility to translate such paths to the platform convention. It's Cargo's responsibility to translate the UNIX-style paths in manifests to Path or PathBuf values in a portable fashion.

That said, once Cargo has translated to a Path or PathBuf, I don't think it should be Cargo's responsibility to deal with Path/PathBuf values that std cannot join properly. path.join(other_path) should suffice; it should not be necessary to manually split into components when joining paths.

The practical effect of this distinction: I think we should parse manifest paths into platform PathBuf values when parsing the manifest, but I do not think we should have any special handling of such values afterwards. If there's a bug in joining PathBuf values, every individual Rust program should not have to work around that.

(We could likely catch most such cases if we could somehow get a message for every call we made to Path::join with a String argument.)

@DJMcNab
Copy link
Author

DJMcNab commented Dec 8, 2020

For implementing that, then our options would be to add a custom deserialise for the PathBufs in

members: Option<Vec<String>>,
#[serde(rename = "default-members")]
default_members: Option<Vec<String>>,
exclude: Option<Vec<String>>,

which properly normalises the / to \ on windows. Alternatively, adding a custom serde adapter somewhere upstream, then using that.
The current deserialise impl is:
https://github.com/serde-rs/serde/blob/e7974312681c9c1c54f7586e86b0d7013b350a54/serde/src/de/impls.rs#L1577
for the record.

Both of these are more work than I'm able to put in just at the moment. I think @XAMPPRocky is on holiday for a while, so it's not urgent for the rust-gpu use case.

@ehuss
Copy link
Contributor

ehuss commented Dec 8, 2020

I don't think normalizing / and \ is sufficient. \\?\ paths do not support .. in paths either, and it is quite common for path dependencies to use ...

My suggestion above to use \\.\ instead is probably not a good one, since those paths do not support long path names, which defeats one of the big benefits of \\?\. Also, Microsoft recommends you do not use them for normal files.

@DJMcNab, can you clarify what the actual problem was? Cargo should work fine with UNC paths (that is, network paths of the form \\SERVER\SHARE\foo\bar). Did you maybe mean device paths with the \\?\ prefix? Microsoft is not very consistent in what they call \\?\ (variously called "DOS device paths" or "NT namespace"), but it is distinctly different from UNC which is explicitly defined as just a network share.

Overall, I'd be reluctant to add a small hack like this, when there are many other places in Cargo where a \\?\ working directory will cause problems.

@DJMcNab
Copy link
Author

DJMcNab commented Dec 9, 2020

Basically this is just the minimum viable hack to make cargo possibly work with a working directory longer than MAX_PATH, in the manner required for rust-gpu. In this case, that requires using a \\?\ working directory.
The build system experiment for rust-gpu requires this because it needs to build a rust sysroot in a build.rs file, and the only information about available scratch dirs is OUT_DIR, but cargo fails both if we use a non-canonicalised cwd in OUT_DIR, with obscure errors, which are because of MAX_PATH. An alternative 'fix' would be for cargo to change where OUT_DIR is, and make it a shorter path.

This explicitly does not try to handle any more complex cases, that is the workspace setup in rust-gpu does not require the use of ..

@dylni
Copy link

dylni commented Dec 10, 2020

Overall, I'd be reluctant to add a small hack like this, when there are many other places in Cargo where a \\?\ working directory will cause problems.

If this is an issue, I created #8964 as a possible alternative. It fixes this issue too, and it's the first step of changing the type of root_dir to BasePathBuf, which would fix this issue in some other locations.

@ehuss
Copy link
Contributor

ehuss commented Aug 6, 2021

Closing as this PR is quite old, and unfortunately it isn't entirely clear on how to proceed with fixing the Windows paths issues (and unfortunately none of us have much time to invest in this). I have opened #9769 as a meta issue to cover this. I understand this can be a frustrating issue to deal with, and this isn't saying we don't want it fixed (we very much do!), but that we want to better understand how to proceed. Thanks!

@ehuss ehuss closed this Aug 6, 2021
@DJMcNab
Copy link
Author

DJMcNab commented Aug 6, 2021

I've been considering closing this myself for a while - just hadn't gotten around to it yet.

As far as I know, rust-gpu has worked out a 'better' (for some definition thereof) workaround.

@DJMcNab DJMcNab deleted the component-wise-join branch August 6, 2021 19:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants