Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Polish usage of the registry as a source #725

Merged
merged 15 commits into from
Oct 27, 2014

Conversation

alexcrichton
Copy link
Member

This PR contains a laundry list of improvements when using the registry as a source, and should be one of the last steps necessary for whipping cargo into shape to using the registry. Some of the highlights include:

  • All APIs have been updated to the registry's current interface
  • Lockfiles are now respected with registry sources
    • Conservatively updating dependencies should work
    • The network shouldn't be touched unless absolutely necessary
    • Lockfiles actually keep versions locked when using a newer registry with more versions
  • A new standalone lib was added for interoperating with the registry (HTTP-request-wise)
  • Packages are now verified before being published to the registry (this can be opted out of)
  • cargo upload was renamed to cargo publish

The commit series is intended to be individually reviewable and each commit should compile and pass all tests independently (but still needs to be applied in order).

@sfackler
Copy link
Member

Is there some way we could avoid having to duplicate the doc URL in the Cargo file and the crate root?

@alexcrichton
Copy link
Member Author

We could take the same strategy as rustc's notion of the crate name and metadata and add an argument to rustdoc which cargo will pass but most others normally won't pass. Something like --extern-url foo=bar

@alexcrichton alexcrichton force-pushed the registry-fixes branch 5 times, most recently from 6fea02f to d4c27f7 Compare October 23, 2014 20:15
@alexcrichton alexcrichton changed the title Update how cargo talks to the registry Polish usage of the registry as a source Oct 23, 2014
@alexcrichton
Copy link
Member Author

I've switched gears to basically fixing all my known issues about resolve, lockfiles, and the registry. I've added a bunch more commits.

@alexcrichton
Copy link
Member Author

I've also updated the title/description!

@alexcrichton alexcrichton force-pushed the registry-fixes branch 2 times, most recently from 2d0fa6f to d5b3a7b Compare October 27, 2014 16:21
@brson
Copy link
Contributor

brson commented Oct 27, 2014

Exciting work. r=me when you are satisfied you've addressed my comments. I tried to actually read everything but I'm fairly underqualified to really understand what's going on here.

This commit includes a laundry list of updates and tweaks to reflect the current
API of the registry:

* `registry.host` has been renamed to `registry.index`
* New top-level manifest keys are now accepted:
  * `homepage` - url
  * `documentation` - url
  * `repository` - url
  * `description` - a markdown-less blurb
  * `license` - string (verified by the registry on upload)
  * `keywords` - string array
  * `readme` - string pointing at a file
* Authors are now uploaded to the registry
* The upload format to the registry has changed to a body json payload
* Unpacking tarballs respects the executable bit for scripts and such.
* Downloading now follows redirects to go to S3.
* The download URL for a package has changed slightly.
* Verify path dependencies have a version listed when being uploaded
* Rename `upload` to `publish`
* Rename `ops::cargo_upload` to `ops::registry`
* Add a new `registry` package for interoperating with the registry
* Add the ability to modify owners via `cargo owner`
* Add a `readme` key to the manifest, and upload its contents to the registry.
* Add the ability to yank crates and their versions
* When packaging a library, verify that it builds from the packaged source by
  unpacking the tarball and simulate running `cargo build` inside of it.
This method shouldn't be necessary as it should be possible to simply add all
sources known in the lockfile to a package registry, and
core::{resolve, registry} should take care of weeding them out if necessary.

In the process of doing so, this actually ends up fixing a problem with
rewriting a dependency to a new one which shares transitive deps. Previously all
the transitive deps were updated, but now the transitive deps remain locked at
where they were previously locked.
In the normal case this isn't necessary due to the Registry dynamically
discovering sources when dependencies are queried for. Consequently there's no
need to register all these sources ahead-of-time, and it reduces the cognitive
load when thinking about sources and updates.
We have a hash map on the side and a method that already aborts re-adding a
source if it was already added.
This will hopefully become useful in the upcoming changes to resolve for
lockfiles with the registry.
Move functionality from cargo_fetch and cargo_generate_lockfile into dedicated
lockfile/resolve modules. The plan is to expand the resolve module significantly
to deal with the lockfile that's loaded.
This commit radically changes the approach of how lockfiles are handled in the
package registry and how resolve interacts with it. Previously "using a
lockfile" entailed just adding all of the precise sources to the registry and
relying on them not being updated to remain locked. This strategy does not work
out well for the registry, however, as one source can provide many many packages
and it may need to be updated for other reasons.

This new strategy is to rewrite instances of `Summary` in an on-demand fashion
to mention locked versions and sources wherever possible. This will ensure that
any relevant `Dependency` will end up having an exact version requirement as
well as a precise source to originate from (if possible). This rewriting is
performed in a few locations:

1. The top-level package has its dependencies rewritten to their precise
   variants if the dependency still matches the precise variant. This covers the
   case where a dependency was updated the the lockfile now needs to be updated.
2. Any `Summary` returned from the package registry which matches a locked
   `PackageId` will unconditionally have all of its dependencies rewritten to
   their precise variants. This is done because any previously locked package
   must remain locked during resolution.
3. Any `Summary` which points at a package which was not previously locked still
   has its dependencies modified to point at any matching locked package. This
   is done to ensure that updates are as conservative as possible.

There are still two outstanding problems with lockfiles and the registry which
this commit does not attempt to solve:

* Yanked versions are not respected
* The registry is still unconditionally updated on all compiles.
This is a bit of an implementation detail of the `SourceId` and we should be
able to get away without exposing it.
When updating a source with multiple packages, the registry will lazily discover
both the precise and imprecise versions of the source at some point. Previously
the source would be updated depending on which was discovered first, but this
commit adds logic to understand that if an imprecise version is discovered after
a precise version then the imprecise version should be favored (because it will
always trigger an update).

This needs to understand, however, that if `cargo update --precise` is used that
those sources should not be updated again, so the sources treated in
`add_sources` are considered "special" in the sense that they're locked and will
not be updated again.
In general the semantics of a yank are that the code itself is not removed from
the registry, but rather packages are no longer allowed to depend on the
version. A yank is normally done to remove broken code or perhaps secrets, but
actually deleting code means that all packages depending on the yanked version
all of a sudden break. For these reasons a yank does not actually delete code,
but only flags the version as yanked in the index.

Yanked packages are therefore able to be depended upon if a lockfile points at a
yanked version, but are not allowed to become new dependencies of packages.

Implementation-wise, the following changes were made:

* SourceIds originating from a lockfile for registries will have a precise
  version listed (just a generic string "locked")
* Dependencies which use precise source ids are allowed to read yanked versions
* Dependencies without a precise source id are not allowed to use yanked
  versions

When using a lockfile (or a previous instance of resolve), all operations will
rewrite dependencies to have the precise source ids where applicable, meaning
the locked versions have access to yanked versions, but the unlocked versions do
not.
This should help avoid hitting disk wherever possible and will also be leveraged
in just a second!
This logic is based off the precision of the registry source's id as well as the
dependencies being passed in. More info can be found in the comments.
@alexcrichton
Copy link
Member Author

I discussed quite a bit of this with @wycats while I was working on it, so I'm going to go ahead and land this.

bors added a commit that referenced this pull request Oct 27, 2014
This PR contains a laundry list of improvements when using the registry as a source, and should be one of the last steps necessary for whipping cargo into shape to using the registry. Some of the highlights include:

* All APIs have been updated to the registry's current interface
* Lockfiles are now respected with registry sources
  * Conservatively updating dependencies should work
  * The network shouldn't be touched unless absolutely necessary
  * Lockfiles actually keep versions locked when using a newer registry with more versions
* A new standalone lib was added for interoperating with the registry (HTTP-request-wise)
* Packages are now verified before being published to the registry (this can be opted out of)
* `cargo upload` was renamed to `cargo publish`

The commit series is intended to be individually reviewable and each commit should compile and pass all tests independently (but still needs to be applied in order).
@bors bors merged commit f8acf4e into rust-lang:master Oct 27, 2014
@alexcrichton alexcrichton deleted the registry-fixes branch October 28, 2014 00:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants