Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fetching cargo registry tokens from external processes #2730

Merged
merged 8 commits into from
Dec 4, 2020

Conversation

pietroalbini
Copy link
Member

@pietroalbini pietroalbini commented Jul 22, 2019

Rendered

This RFC proposes a new cargo configuration key to fetch registry tokens from external processes, allowing to store them in password managers or other secret storage systems:

[registry]
credential-process = ["bash", "-c", "pass tokens/crates-io | head -n 1"]

Thanks @Eh2406 for the feedback on the draft!

@pietroalbini pietroalbini added A-security Security related proposals & ideas T-cargo Relevant to the Cargo team, which will review and decide on the RFC. A-registry Proposals relating to the registries of cargo. labels Jul 22, 2019
@llebout
Copy link

llebout commented Jul 22, 2019

Cool, please do add this! Unifying everything in GNOME keyrings and the likes is really nice.

@pietroalbini pietroalbini force-pushed the cargo-token-process branch from a1ed3d7 to 0471313 Compare July 22, 2019 19:03
@kornelski
Copy link
Contributor

macOS Keychain has access controls per application, and enforces that by verifying signature of the calling application.

The problem is, if you run a subprocess that uses Keychain, the Keychain will use identity of the subprocess. That breaks the chain of trust of the Keychain, because this way any application can just call the subprocess to get the token.

So regardless of this method, for macOS I'd prefer the solution mentioned in the alternatives section.

@pietroalbini
Copy link
Member Author

The problem is, if you run a subprocess that uses Keychain, the Keychain will use identity of the subprocess. That breaks the chain of trust of the Keychain, because this way any application can just call the subprocess to get the token.

Didn't know about that, but still, this PR would be an improvement because the users will be prompted every time the token is needed (if they don't "always allow"), providing a small layer of protection. Does Keychain require binaries to be signed to identify the requesting process? In that case, an RFC adding explicit Keychain support would also need to figure out the signing on our infra.

@tarcieri
Copy link

@kornelski I think it'd be great if cargo had built-in OS specific backends for secret storage on operating systems that have some sort of keychain abstraction.

However, that seems like an orthogonal nice-to-have, and there are several places I could see using something like this particular PR. One is "cloud shell" environments where tokens can be encrypted using the cloud's KMS service, and an OAuth credential used to talk to the KMS to decrypt tokens prior to use (logging each usage to an audit log).

As a sidebar and not-too-serious suggestion, there are also ways to circumvent the Keychain Services identity delegation-to-a-subprocess problem with megahax (instead of spawning a subprocess, you can load it into memory with NSCreateObjectFileImageFromMemory and link it into the running executable with NSLinkModule)

@kornelski
Copy link
Contributor

@pietroalbini That "allow/always allow" window pops up on unsigned executables or when the signature doesn't match.
Intended use of the Keychain does require having binaries signed. I know it's an open issue, and I'm hoping it'll get resolved too.

@tarcieri I get usefulness of having easy extensibility. I just wanted to point out that for proper macOS integration it's not entirely sufficient.

@tarcieri
Copy link

@kornelski as a separate issue, adding Keychain Services-backed secret storage to cargo (and abstractions for multiple token providers) is something I'd be interested in helping work on.

@pietroalbini
Copy link
Member Author

pietroalbini commented Jul 24, 2019

Renamed the configuration entry to token-from-process, as it's a bit more clear and the pattern is nicer for future extensions (like token-from-keychain = true).

@lukaslueg
Copy link

As far as I can see we are not required to promise that cargo will execute the subprocess in a shell; the user can specify to execute /bin/sh -c ... by himself if he does, in fact, need shell functionality (e.g. piping to awk as in the given example).

I recommend making explicit in the RFC whether the subprocess will inherit cargo's stdin or not. A working stdin allows the subprocess to ask for a password.

Otherwise an error message will be shown to the user, along with the standard output.

That is, the subprocesses' stdout is shown to the user? Shouldn't stdout and stderr simply be inherited so the subprocess can - if possible - print something to the tune of "please put your 2fa-thongy into the usb-thingy" while waiting for input.

@pietroalbini
Copy link
Member Author

As far as I can see we are not required to promise that cargo will execute the subprocess in a shell; the user can specify to execute /bin/sh -c ... by himself if he does, in fact, need shell functionality (e.g. piping to awk as in the given example).

How do we call the command without running it through a shell though? std::process::Command requires the args to be splitted, so we either implement splitting in Cargo (which doesn't really make sense), require something like token-from-process = ["program", "arg1", "arg2"] which is not that ergonomic or defer to the shell.

I recommend making explicit in the RFC whether the subprocess will inherit cargo's stdin or not. A working stdin allows the subprocess to ask for a password.

That is, the subprocesses' stdout is shown to the user? Shouldn't stdout and stderr simply be inherited so the subprocess can - if possible - print something to the tune of "please put your 2fa-thongy into the usb-thingy" while waiting for input.

Good points! Clarified that stderr and stdin will be inherited, while stdout will be captured to read the token.

@alexcrichton
Copy link
Member

One thing that's worth pointing out is that Cargo already has support for invoking external processes to get credential information, it just only uses it for git authentication. The "credential helper" support that is located in git2 is intended to support Git configuration for external tokens for authentication with git servers.

It'd be great if that basically ended up being the same as whatever Cargo uses natively, and Cargo could be refactored to literally share the same code as well.

One downside of the implementation in git2 is that the deduction of how to invoke the command is pretty arcane and likely buggy to the point of just being downright confusing. I think that's something we'll want to fix/improve on if possible for the Cargo implementation, and I'm not sure if it's as easy as simply saying "pass to sh on Unix and cmd on Windows". That makes documentation about this feature, for example, more difficult because then you have to have conditional configuration depending on the platform you're running on (and cmd really is particularly weird in the syntax it accepts).

@pietroalbini
Copy link
Member Author

One thing that's worth pointing out is that Cargo already has support for invoking external processes to get credential information, it just only uses it for git authentication. The "credential helper" support that is located in git2 is intended to support Git configuration for external tokens for authentication with git servers.

It'd be great if that basically ended up being the same as whatever Cargo uses natively, and Cargo could be refactored to literally share the same code as well.

Do you mean reusing/sharing just the code used to call the process or the whole git credential helper protocol? I feel like the protocol itself is a bit overkill for what we want to do here.

One downside of the implementation in git2 is that the deduction of how to invoke the command is pretty arcane and likely buggy to the point of just being downright confusing. I think that's something we'll want to fix/improve on if possible for the Cargo implementation, and I'm not sure if it's as easy as simply saying "pass to sh on Unix and cmd on Windows". That makes documentation about this feature, for example, more difficult because then you have to have conditional configuration depending on the platform you're running on (and cmd really is particularly weird in the syntax it accepts).

Yeah, git2's approach of calling sh and if it fails splitting by whitespace is not ideal. I don't think using the system's native shell (sh or cmd) is that bad in terms of documentation, as most of the commands we could put in the example are going to either be disqualified for endorsing a specific product or be tied to the host system anyway.

Thinking about it a bit more, @lukaslueg's idea of just skipping the shell could work if we define in the RFC that the command will be split according to the POSIX standard. There is a crate that already implements the splitting code according to that.

@lukaslueg
Copy link

First and foremost this is an educated opinion, but: I would rather avoid having a string that cargo will execute by calling a hoped-for-to-be-there shell and just hand stuff to it.

Splitting the string does not require much code, is easily tested and does not rot. As an alternative, I don't think ["cmd", "arg1"] is that bad, especially if the single-string-syntax is as valid as the list-syntax.

token-from-process = "/usr/bin/helper"
token-from-process = ["/usr/bin/helper", "--quiet"]

@alexcrichton
Copy link
Member

@pietroalbini ah yeah to clarify I mean mostly to bring up what Cargo does today as an example of how Cargo's already doing this in some capacity. If the git protocol makes sense for us we could adopt it pretty easily, but if it doesn't make sense then I don't see any reason we should bend over backwards to do so.

We have, in rustc and cargo, avoided any form of shell escaping in all (afaik) ways of passing multiple arguments. I'd personally like to continue doing so, and the general solution we have is that there's a convenience "give me a bunch of arguments quickly" option which splits on whitespace, and then there's a "give me one argument at a time" interface which does no splitting. (e.g. -Clink-args vs -Clink-arg). Basically what @lukaslueg proposed in terms of TOML configuration is what Cargo already does internally for a number of other configuration values, and I think it would fit nicely here.

It's also worth pointing out that git provides a default built-in credential cache (I think it's like git credential-cache as a subcommand), and it is probably worthwhile to consider that Cargo should grow its own default credential manager. This would serve as a test case for Cargo, a reasonable default (although it should probably still be disabled by default), and something to put in the documentation which is cross-platform.

@ghost
Copy link

ghost commented Jul 30, 2019

For the shell selection, a new attribute might work. ( just thinking outload without knowing too much detail about the subject)
'''
[registry.windows]
default-shell="cmd"
[registry.linux]
defaul-shell="bash"
[registry]
...
'''

This way it has a non hard coded configuration that is separated from the actual parameters and cross platform.
Also this concept can be generalized for other command moving this setting into some "common" cargo config file. Tough it may make things harder to read

@pietroalbini
Copy link
Member Author

Updated the RFC to reflect @lukaslueg's idea.

issued to let the user know about that.

The `token-from-process` key accepts either a string containing the binary to
call or a list containing the binary name and the arguments to provide to it.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW Cargo's intepretation of this today is that a bare string is a whitespace-separated list of arguments and a list of strings is the exact list of arguments. We'll probably want to stay consistent with the rest of Cargo in that respect?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My worry about splitting by whitespace is it just breaks in unexpected ways if people wants to use a shell:

token-from-process = 'bash -c "echo foo | head -n 1"'

With some CLI password managers this is going to be really common, and splitting ["bash", "-c", "\"echo", "foo", "|", "head" "-n", "1\""] is just unintuitive. I'd prefer to either properly split quotes (by using crates like shlex) or just not split at all.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's possible, yes, but I think it's more important to either to stay consistent with Cargo or fully commit to some form of parsing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I'd prefer to just commit to split like a POSIX shell. It's not that complex and there is already an implementation available in the shlex crate.

Ugh, your reply was lost in the notifications... Sorry for the delay responding.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand, yes, but i personally at least feel that using shlex is a local optimum which misses how nothing else in Rust uses shlex. For example -C link-args, RUSTFLAGS, etc. Nothing across our tooling which accepts multiple arguments has ever used shell escapes, and this seems like a pretty odd case to be the first one to use it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would second the objection to shlex or any other parsing.

I'd actually prefer to prohibit "single string with whitespace splitting" and only allow a list of arguments, but if we don't want to do that, then let's be consistent with the rest of Cargo.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joshtriplett so, is the current text of the RFC what you'd like?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sgrif
Copy link
Contributor

sgrif commented Sep 26, 2019

Do you foresee this being something we would want to have interact with 2fa? e.g. if the process that generates your TOTP token were on the same machine

@pietroalbini
Copy link
Member Author

Do you foresee this being something we would want to have interact with 2fa? e.g. if the process that generates your TOTP token were on the same machine

I'd love to have something like this for an eventual 2FA support in Cargo as well (I store my TOTP tokens on my YubiKey), but I think it's outside the scope of this RFC as there is no need for Cargo to request TOTPs at the moment.

@sgrif
Copy link
Contributor

sgrif commented Sep 26, 2019

Sounds fine. Might be worth mentioning it under "Future Possibilities" or at least calling out that this doesn't introduce any incompatibilities there since it's the most likely addition to happen for auth in the near future

@pietroalbini
Copy link
Member Author

Done!

@ehuss
Copy link
Contributor

ehuss commented Feb 5, 2020

Hey @pietroalbini, sorry it took so long to follow up on this.

I was wondering if we could add an "action" argument to the command which would pass "get", "store", or "erase" as an argument? It would just be added as the last argument to the command. It would have similar semantics as other tools. I believe that is how Docker, git, and a few other tools work. This would allow cargo login to use the store action to properly insert the token into the user's keychain, which may not be a particularly pleasant process to do manually. If we ever add cargo logout we could use erase to handle that. WDYT?

@pietroalbini
Copy link
Member Author

@ehuss I see merits in both approaches:

  • token-from-process, proposed by this RFC, is easier to setup as it just calls the program you choose and treats stdout as the password. There is also no requirement for a wrapper script to be created, unlike the other proposal.
  • A "credentials helper" approach would require custom wrappers that glue together Cargo and the password manager you use, and you would need to either write them from scratch or download them from the Internet. If we end up this route we shouldn't create our own protocol but just try to reuse an existing one, to get a production-ready suite of helpers from the start.

Personally I'd prefer the first approach (as it would integrate better with my personal tooling), but I don't see why the two of them couldn't coexist?

@alexcrichton
Copy link
Member

It seems to me like it'd be fine to have a wrapper script to adapt the "cargo protocol" to a password manager's protocol. I think we'll want the flexibility in the future as well to tweak this over time with new capabilities rather than having to reinvent something new each time.

@ehuss
Copy link
Contributor

ehuss commented Sep 2, 2020

I have update the RFC with the extension of optionally allowing a credential process to take an "action" that should make it more user-friendly. Specifically:

  • Changes token-from-process to credential-process. This is just some bike-shedding, to choose something that is more generic (in case we add non-token auth in the future), and uses a term (credential) that is more common.

  • Adds an {action} command-line argument which gets replaced with store, get, and erase. This allows Cargo to manage adding/removing tokens from the credential storage, to make it easier to use. To compromise, if the {action} argument isn't passed, it behaves as before where it is only used for fetching the token.

    • Added some other command-line arguments like {name} for convenience.
  • Added CARGO_REGISTRY_API_URL environment variable (for convenience, possibly for future enhancement, and to provide some context in the credential store).

  • A variety of smaller changes and tweaks.

@ehuss
Copy link
Contributor

ehuss commented Sep 23, 2020

I'm going to propose to merge this. I'm happy with the proposed design, and there hasn't otherwise been much more feedback. I think it will be useful to move forward to an implementation to get more scrutiny on it, and overall I think it could be a very useful feature.

@rfcbot fcp merge

@rfcbot
Copy link
Collaborator

rfcbot commented Sep 23, 2020

Team member @ehuss has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. disposition-merge This RFC is in PFCP or FCP with a disposition to merge it. labels Sep 23, 2020
@Eh2406

This comment has been minimized.

@rfcbot rfcbot added final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. and removed proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. labels Sep 25, 2020
@rfcbot
Copy link
Collaborator

rfcbot commented Sep 25, 2020

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this RFC. to-announce and removed final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. labels Oct 5, 2020
@rfcbot
Copy link
Collaborator

rfcbot commented Oct 5, 2020

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

The RFC will be merged soon.

@ehuss ehuss merged commit ed6cce4 into rust-lang:master Dec 4, 2020
@ehuss
Copy link
Contributor

ehuss commented Dec 4, 2020

Huzzah! The @rust-lang/cargo team has decided to accept this RFC.

To track further discussion, subscribe to the tracking issue here:
rust-lang/cargo#8933

@pietroalbini pietroalbini deleted the cargo-token-process branch December 4, 2020 17:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-registry Proposals relating to the registries of cargo. A-security Security related proposals & ideas disposition-merge This RFC is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this RFC. T-cargo Relevant to the Cargo team, which will review and decide on the RFC. to-announce
Projects
No open projects
Status: Done (Stabilized)
Development

Successfully merging this pull request may close these issues.