Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Submit repo-relative readme path relative to crates.io #9837

Closed
wants to merge 1 commit into from

Conversation

nipunn1313
Copy link
Contributor

Ideally it's a repo-relative readme path, but this seems
to be the closest we can get. It's typical to have a workspace
be at the repo root, so it seems like an improvement at least.

Crates.io (currently) uses the readme path to determine its filetype.
With this, crates.io would
also use it to render relative links in non-root READMEs. This diff would allow
it to correctly render relative links in READMEs from non-root Cargo.toml

Added unit tests for publishing with

  • cargo.toml in root + readme in root
  • cargo.toml in root + readme in subdir
  • cargo.toml in subdir + readme next to it
  • cargo.toml in subdir + readme in root (../README)

Should fix rust-lang/crates.io#3484 in
combination with rust-lang/crates.io#3861

@rust-highfive
Copy link

r? @alexcrichton

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Aug 24, 2021
@nipunn1313
Copy link
Contributor Author

Here's my screenshot of an e2e test where I use my local cargo (w/ this patch) to cargo publish to my local crates.io with rust-lang/crates.io#3861

As you can see, the relative link to Otherfile.md is appropriately inside the innercrate/docs directory! (statusquo is a broken link to just README)

image

@nipunn1313
Copy link
Contributor Author

I realized I could just do repo-relative paths via git2::discover() - so I did that!
What's the recommended way to get this working on windows? Convert a PathBuf to a fwd-slash path (since that's I think what crates.io expects).


let readme_path_relative_to_repo = readme.as_ref().map(|readme| {
|| -> Option<String> {
let repo = git2::Repository::discover(pkg.root()).ok()?;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would this ever problematically discover too high up (eg maybe user made their whole root a git directory, and are working on a cargo project that isn't in git)? Is there some way to bound this discovery? Maybe we don't care so much for that case. It seems very contrived.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see now my comment below is related to this. tl;dr; we already do some form of git repo discovery so this should match that (either by threading info or with a helper function). I think this is a good question to ask but it's a preexisting one you don't need to solve here.

@nipunn1313 nipunn1313 changed the title Submit ws-relative readme path relative to crates.io Submit repo-relative readme path relative to crates.io Aug 24, 2021
"links": null,
"name": "foo",
"readme": "HELLO",
"readme_file": "docs/README",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will probably end up failing on windows with a backslash here? I think we'd want to send forwardslashes to crates.io here. How to convert?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With green tests I think this worked out?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I need to write a new test to exercise this situation - where the Cargo.toml is not in the root of the repo (eg a repo with multiple crates).

I would expect it to come up with something like
path\to\innercrate\docs/README as is

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok that makes sense, I think it's fine to specifically change \ to / with a replace call in an ad-hoc fashion since it's purely for this one field.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool. I was unable to get the tests to actually cover this because they use weak json matching (that converts \ to /). I made the replace change and manually confirmed it works with some println's in the compare.rs to make sure.

I wonder if there's somewhere we can put some assert in the test implementation of the registry. I was having trouble tracking down what cargo publish does in the unit tests - maybe the test backend can assert for fwd slashes only? Do you know where to look for that?

@alexcrichton
Copy link
Member

Seems reasonable to me to add, thanks for this!

For finding the git repo, IIRC we do that somewhere in the packaging step to get the head revision to encode in some metadata we upload as well. Instead of rediscovering the repository twice could the repo information get threaded out of there into this transmission step? I'm slightly worried that the parent git repo isn't always guaranteed to be the project repo so we may add heuristics one day to improve that logic, and if we do so we should ideally only do that in one location.

@nipunn1313 nipunn1313 force-pushed the readme_path branch 2 times, most recently from 167f947 to 32a8c7a Compare August 28, 2021 05:28
Added unit tests for publishing with
- cargo.toml in root + readme in root
- cargo.toml in root + readme in subdir
- cargo.toml in subdir + readme next to it
- cargo.toml in subdir + readme in root (../README)

Should fix rust-lang/crates.io#3484 in
combination with rust-lang/crates.io#3861
@nipunn1313
Copy link
Contributor Author

For finding the git repo, IIRC we do that somewhere in the packaging step to get the head revision to encode in some metadata we upload as well. Instead of rediscovering the repository twice could the repo information get threaded out of there into this transmission step? I'm slightly worried that the parent git repo isn't always guaranteed to be the project repo so we may add heuristics one day to improve that logic, and if we do so we should ideally only do that in one location.

Started doing this and realized it was a bit of a refactor - so I split it out into its own PR - #9851. That PR depends on this one - so github's diff view isn't great about only showing the differences. But if this one approves and lands, I think github should update #9851 to only show the refactor portion.

@alexcrichton
Copy link
Member

@bors: r+

@bors
Copy link
Contributor

bors commented Aug 30, 2021

📌 Commit bb7a63f has been approved by alexcrichton

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 30, 2021
@ehuss
Copy link
Contributor

ehuss commented Aug 30, 2021

@bors r-

Sorry, I'd like to make some comments here.

@bors bors added S-waiting-on-author Status: The marked PR is awaiting some action (such as code changes) from the PR author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Aug 30, 2021
@ehuss
Copy link
Contributor

ehuss commented Aug 30, 2021

Has #5911 or #8434 been considered here? Would this make it more difficult for other registries to fix the relative links?

cc @kornelski, can you comment on whether or not lib.rs would be able to make use of this or have problems?

At a minimum, the documentation needs to be updated here.

Do you know if it will break other registries? This is also somewhat of a breaking change, since it is changing the meaning of readme_file.

This seems to make readme_file contain paths that don't exist in the .crate file, making them possibly even more difficult to handle.

@kornelski
Copy link
Contributor

kornelski commented Aug 30, 2021

Making readme_file point to a file that is not in .crate tarball is a breaking change. I'd prefer to keep this field as-is.

How about extending .cargo_vcs_info.json to have a key like path_in_repository?

For lib.rs I'm cloning the repo URL from repository Cargo.toml field, and recursively search the cloned repo for the crate in order to discover the crate's path inside the repo. If you could provide the path in repo via the crate tarball it'd be very helpful.

@nipunn1313
Copy link
Contributor Author

Hey! Thanks @ehuss for linking those issues. It didn't come across my mind that there are other registry implementations.

I had read through the crates.io implementation, and it only uses the extension of the readme file - so it wouldn't have an effect there. Crates.io doesn't really use the full path. However, if this is an API change used by multiple registry implementations, then it is in fact a breaking change.

rust-lang/crates.io#3861 would get crates.io to rewrite the relative links.

It looks like @kornelski had some nice writeup in #8434 which overlaps rust-lang/crates.io#3484.
I'll leave comments in #8434 to keep them together

@nipunn1313
Copy link
Contributor Author

How about extending .cargo_vcs_info.json to have a key like path_in_repository?

For lib.rs I'm cloning the repo URL from repository Cargo.toml field, and recursively search the cloned repo for the crate in order to discover the crate's path inside the repo. If you could provide the path in repo via the crate tarball it'd be very helpful.

I implemented this idea in #9866. It looks pretty elegant - and backward compatible!

@nipunn1313
Copy link
Contributor Author

going to close in favor of #9866

@nipunn1313 nipunn1313 closed this Sep 9, 2021
bors added a commit that referenced this pull request Sep 26, 2021
Support path_in_vcs as part of cargo_vcs_metadata

Depends on #9865 - this PR will look simpler once that one lands. I can't target my PR to the base branch of #9865 as an external contributor (since it's in my fork). For now just review the latest commit of this one.

This is a backward compatible change that is an alternative to #9837
@nipunn1313 nipunn1313 deleted the readme_path branch September 26, 2021 22:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-author Status: The marked PR is awaiting some action (such as code changes) from the PR author.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Relative URLs in subdirectory readmes
6 participants