Add checksum validation of downloaded archives #205

Open
ppetr opened this issue Jul 12, 2022 · 9 comments

Comments

@ppetr

ppetr commented Jul 12, 2022

While downloading from GitHub via HTTPS gives a reasonable level of security, I'd still prefer to have the binaries verified against their respective checksum files.

I propose adding a field with the URL of a checksum file, together with a checksum of that checksum file itself. Example:

  foo_binary:
    fetch:
      url: https://github.com/...
    checksums:
      type: sha256  # Hash type used in checksums.txt as well as in `hash` below.
      url: https://github.com/foo/bar/releases/.../checksums.txt
      hash: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855

After downloading an archive for foo_binary, binenv would also download the checksums.txt file, verify its integrity, and then verify the integrity of the archive against the appropriate checksum in checksums.txt.
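A minimal sketch of that two-step check, assuming checksums.txt uses the common `sha256sum` layout (`<hex digest>  <filename>` per line). The function names `parse_checksums` and `verify_archive` are hypothetical and are not part of binenv:

```python
import hashlib


def sha256_hex(data):
    """Return the lowercase hex SHA-256 digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


def parse_checksums(text):
    """Parse '<hex digest>  <filename>' lines into a {filename: digest} dict.

    Assumption: the file follows the sha256sum output format; a leading '*'
    on the filename (binary-mode marker) is dropped.
    """
    entries = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        digest, name = parts
        entries[name.strip().lstrip("*")] = digest.lower()
    return entries


def verify_archive(archive, archive_name, checksums_file, pinned_hash):
    """Raise ValueError unless both verification steps succeed."""
    # Step 1: the checksums file must match the hash pinned in the config.
    if sha256_hex(checksums_file) != pinned_hash.lower():
        raise ValueError("checksums file does not match the pinned hash")
    # Step 2: the archive must match its entry in the checksums file.
    expected = parse_checksums(checksums_file.decode("utf-8")).get(archive_name)
    if expected is None:
        raise ValueError("no checksum entry for " + archive_name)
    if sha256_hex(archive) != expected:
        raise ValueError("archive does not match its listed checksum")
```

Here `archive` and `checksums_file` are the downloaded bytes, and `pinned_hash` corresponds to the `hash` field in the example above. The hash shown in that example is the SHA-256 of empty input, so it works only as a placeholder.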

An alternative would be to include all the hashes in distributions.yaml itself, but I think it'd be way too verbose.

I'm happy to contribute a PR once an agreement is reached on the details.

@leucos
Contributor

leucos commented Jul 13, 2022

Hello @ppetr

That would be great.

However, I've left this aside for now because I fear there will be a lot of nitty-gritty details (checksums for compressed binaries, checksums inside tarballs, non-standard checksum file formats, ...).

If you want to tackle this, please go ahead. The above proposal seems fine to me. Keep us informed!

@ppetr
Author

ppetr commented Jul 13, 2022

Great! Let's keep it simple and just verify the checksums of tarballs where they're already provided. Later we can think about expanding it, if needed.

I'll then start working on a prototype and I'll keep you updated 🙂.

@ppetr
Author

ppetr commented Jul 17, 2022

After some experimenting I came up with the following ideas, which can be implemented relatively independently of each other.

Provide a checksum of a checksum file for each released version. This is the simplest solution and probably easiest to work with for authors/maintainers, but it's a bit more verbose in the distribution file.

fzf:
  description: fzf is a general-purpose command-line fuzzy finder.
  url: https://github.com/junegunn/fzf/
  list:
    type: github-releases
    url: https://api.github.com/repos/junegunn/fzf/releases
  fetch:
    url: https://github.com/junegunn/fzf/releases/download/{{ .Version }}/fzf-{{ .Version }}-{{ .OS }}_{{ .Arch }}.tar.gz
  integrity:
    url: https://github.com/junegunn/fzf/releases/download/{{ .Version }}/fzf_{{ .Version }}_checksums.txt
    checksums:
      - url: https://github.com/junegunn/fzf/releases/download/0.30.0/fzf_0.30.0_checksums.txt
        type: sha256
        checksum: 43cc37783e0bf4ed775109379b3e2073ea2bb29c9e4811d07907c868435e1b7e
      # Other versions follow.
  install:
    type: tgz
    binaries:
      - fzf

Use OpenPGP to sign checksum files. This is often done by authors who already use PGP.

A random example: https://github.com/orgrim/pg_back/releases/tag/v2.1.0. The release includes the checksums.txt and its signature checksums.txt.asc. Then it'd be enough to provide the public key(s) of the author(s) once and use it to verify any of their releases:

  integrity:
    url: https://github.com/junegunn/fzf/releases/download/{{ .Version }}/fzf_{{ .Version }}_checksums.txt.sig
    public_key:
      # This would require https://gopenpgp.org/, probably a more heavy-weight library.
      openpgp:
        - |
          -----BEGIN PGP PUBLIC KEY BLOCK-----
          ...

@ppetr
Author

ppetr commented Jul 17, 2022

So my questions are:

  1. Are these options (or one of them) reasonable to implement?

  2. For OpenPGP, is it acceptable to add this non-trivial dependency? We could also resort to calling gpg externally, but to me that feels a bit against the spirit of binenv, which is otherwise very self-contained.

    Another interesting alternative could be saltpack, but its drawback is that it's very new, so its adoption rate would probably be much smaller.

@leucos
Contributor

leucos commented Jul 19, 2022

Well, I think I did not understand your initial proposal.

I thought you wanted to grab the checksums (when they existed) from the released artifacts and compare to what has been downloaded by binenv install.

But it seems you'd like to add checksums for all versions in distributions.yaml.

I do not think we should be the custodians of fingerprints; this is too much responsibility (and also quite a chore to maintain; try a make e2e in the repo and you'll feel the pain).

So I am not convinced we're heading the right way here.

@ppetr
Author

ppetr commented Jul 19, 2022

I see your point.

From a reliability perspective, checking against checksum files in releases might help a bit, but I guess modern HTTPS is already very good at ensuring reliable transmission. And since files are always compressed, their internal integrity is verified by mechanisms such as the CRCs built into the decompression formats.

My perspective is rather security. Imagine, say, that the GitHub account of a very popular binary becomes compromised. An attacker can replace or create a release with corrupted binaries as well as matching hashes. Then thousands of computers will become infected by malware.

I agree that maintaining checksums of individual files is just not sustainable.

Then how about authors' PGP public keys and/or fingerprints? This means adding, once per eligible project, a single string that won't change over time (or only extremely rarely). This information could even be scraped automatically, for example from project README.md files (if present), with the important property that automation would never change it once present. Then:

  • If the author signed releases with the corresponding PGP key, binenv could verify them automatically for all releases without further intervention.
  • If an attacker compromises a GitHub account, they won't be able to sign new/changed releases. And even if they change the PGP fingerprint in the README.md file, binenv automation won't accept the change without manual inspection.
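The "never changed by automation once present" rule could be sketched roughly as follows. This is only an illustration of the policy: the function `update_pin`, the in-memory `pins` dict, and the status strings are hypothetical and do not exist in binenv.

```python
def update_pin(pins, project, observed_fingerprint):
    """Apply a pin-once policy to a scraped key fingerprint.

    pins maps a project name to its pinned fingerprint. Returns one of:
      'pinned'        the project had no pin, so the observed value is stored
      'unchanged'     the observed value matches the existing pin
      'needs-review'  the observed value differs; the pin is left untouched
                      and a human has to inspect the change
    """
    # Normalize so that spacing and letter case do not cause false alarms.
    normalized = observed_fingerprint.replace(" ", "").upper()
    current = pins.get(project)
    if current is None:
        pins[project] = normalized
        return "pinned"
    if current == normalized:
        return "unchanged"
    return "needs-review"
```

With this policy, an attacker who rewrites the fingerprint in a compromised README.md only triggers `needs-review`; the stored pin stays as it was.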

That way a reasonable level of security can be reached and hopefully with little intrusion.

@leucos
Contributor

leucos commented Jul 22, 2022

Interesting. Do you have examples of such signed releases?

@ppetr
Author

ppetr commented Jul 24, 2022

Let me give a couple of examples:

The checksums are either given as a single file (usually .txt.asc) which contains both the original checksums as well as the signature, or as a pair of files (usually .txt and .txt.asc), where the latter contains a detached signature of the former. More details can be found here: https://www.gnupg.org/gph/en/manual/x135.html
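Telling the two layouts apart could look something like the sketch below, assuming the files are ASCII-armored. It relies only on the standard OpenPGP armor header lines; the function name `classify_signature_file` is hypothetical, and binary (non-armored) signatures are simply reported as unknown.

```python
CLEARSIGN_HEADER = "-----BEGIN PGP SIGNED MESSAGE-----"
DETACHED_HEADER = "-----BEGIN PGP SIGNATURE-----"


def classify_signature_file(text):
    """Classify an ASCII-armored file by its first non-blank line.

    Returns 'clearsigned' when the file carries the checksums together with
    the signature, 'detached' when it holds only a signature for a separate
    checksums file, and 'unknown' otherwise.
    """
    first_line = ""
    for line in text.splitlines():
        if line.strip():
            first_line = line.strip()
            break
    if first_line == CLEARSIGN_HEADER:
        return "clearsigned"
    if first_line == DETACHED_HEADER:
        return "detached"
    return "unknown"
```

In the clearsigned case the checksums would then be read from the signed text itself; in the detached case the companion .txt file has to be downloaded as well.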

I also found out that the fingerprint of the signer's key can be extracted from a signature (https://security.stackexchange.com/q/62916/12485), which means it'd be possible to collect these fingerprints from all projects that publish such a signature, without requiring the authors to publish the key somewhere. That would make the process even more seamless.

@leucos
Contributor

leucos commented Jun 26, 2024

I understand, and that's really interesting. However, there are a lot of devilish details to handle (gpg availability on the system being one of them, fetching hashes, keys, ...).

It is quite a piece of work, and it will only apply to a very small subset of distributions.

If someone wants to take this on, that's fine with me. But I am not sure that, with the time I have on my hands, I can do this right now.

I feel that some kind of autodiscover feature for repos should be written (after having to add and debug almost 300 entries in the distributions file, you really feel the urge for this tool). Maybe it could handle this, and we could adapt the sources struct to handle signatures.

But again, time is lacking on my side, sorry.
