Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ADR 0007: Storing GPG encrypted secrets. #36

Closed
wants to merge 1 commit into from
Closed

Conversation

rabernat
Copy link
Contributor

This is a potential solution to handling of secrets in Pangeo Forge.

)
```

The `Secret` type will subclass `str` but will load its values from either environment variables or Prefect Secrets.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is a potential implementation

import os
import pickle

class Secret(str):
    
    @classmethod
    def _get_secret_from_env(cls, env_var_name):
        try:
            from prefect.client import Secret as PrefectSecret
            return PrefectSecret(env_var_name).get()
        except (ImportError, ValueError):
            return os.environ[env_var_name]
    
    def __new__(cls, env_var_name):
        val = cls._get_secret_from_env(env_var_name)
        return super().__new__(cls, val) 
    
    def __init__(self, env_var_name):
        self._env_var_name = env_var_name
    
    def __reduce__(self):
        return (self.__class__, (self._env_var_name, ))
    
os.environ["FOO"] = "BAR"
s = Secret("FOO")
assert s == "BAR"
assert s._env_var_name == "FOO"
p = pickle.dumps(s)
os.environ["FOO"] = "BAZ"
s_ = pickle.loads(p)
assert s == "BAR"
assert s_ == "BAZ"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this go in pangeo_forge_recipes.storage?


## Context

Recipes often need to pull data from download locations (e.g. FTP servers) that require authentication with secres (e.g. username / password).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sp: secres -> secrets


## Consequences

- Recipes will no longer store plain-text passwords.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we introduce some kind of check at CI time to flag up P-text passwords to discourage their use?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good idea, but I personally wouldn't know how to write it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good idea, but I personally wouldn't know how to write it.

FilePattern.query_string_secrets and FilePattern.fsspec_open_kwargs are both passed as dicts.

query_string_secrets are always secret, so we could check:

for v in recipe.file_pattern.query_string_secrets.values():
    assert isinstance(v, Secret)

Only certain key:value pairs of fsspec_open_kwargs are secrets, so maybe:

for k, v in recipe.file_pattern.fsspec_open_kwargs.items():
    if k in ["username", "password"]:
        assert isinstance(v, Secret)

## Consequences

- Recipes will no longer store plain-text passwords.
- We will have to create and publish these key pairs.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And presumably rotate them? We probably ought to define a key rotation schedule.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! That's the sort of suggestion I was looking for. What would be a good rotation frequency?

Here's an article about how to rotate your gpg keys.


- Recipes will no longer store plain-text passwords.
- We will have to create and publish these key pairs.
- We will have to test whether we can correctly pass these secrets through to the bakeries.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What issues do you forsee here?

Copy link
Contributor Author

@rabernat rabernat Sep 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mainly prefect stuff. We need to correctly get the secrets from the encrypted file, pass them to prefect cloud, have the bakeries access them. Seems like lots could go wrong. But we won't know until we try.

Copy link
Member

@cisaacstern cisaacstern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe I'm especially qualified to comment on the security implications of this ADR, but the implementation itself is clean and easy-to-understand.

For the most part, the credentials under consideration here should be one-off logins for specific data servers/services, so we might just want to add a reminder in the eventual documentation that users should make sure these credentials are unique and not to reuse login information from their emails, online banking, etc.

@tracetechnical
Copy link

@rabernat I am still considering the above RE: frequency et al. We should dig out and follow industry best practise regarding this.
@cisaacstern I agree, we should link to some documentation (preferably someone else's to save duplication of effort) around good password hygiene and how to make strong passwords. Preferably with a few horror stories to push the point home.

@rabernat
Copy link
Contributor Author

One challenge I see around secret rotation is the fact that encrypted secrets will get checked into the feedstocks once and potentially stay there for years! Should we expect receipe maintainers to periodically re-encrypt their credentials after we rotate keys? Or would it be okay to keep using the old keys to for those old secrets?

@cisaacstern
Copy link
Member

and potentially stay there for years!

This should only become relevant if the feedstock is re-run (i.e., if it's tagged with a new version), correct? In which case we could add a GitHub workflow to the feedstock template repository that checks if the gpg key used is up-to-date? This would require adding a field for gpg key to the meta.yaml, I guess. (Which maybe isn't such a bad idea anyway?) I'm assuming here there is no way to recover a public key directly from an encrypted file

@cisaacstern
Copy link
Member

cisaacstern commented Mar 15, 2022

Just learned about https://github.com/mozilla/sops via @yuvipanda's tweet. Seems like another useful option to consider re: secret handling.

@yuvipanda
Copy link
Contributor

Yeah, sops is amazing and much more user friendly than gpg. Highly recommend. You can use gpg with it too if you like.

@tracetechnical
Copy link

SOPS is 100% where its at. I have deployed this into production at a few places now.

@abarciauskas-bgse
Copy link
Contributor

closing as stale

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants