Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved API Tokens #32

Open
wants to merge 16 commits into
base: main
Choose a base branch
from
Open

Improved API Tokens #32

wants to merge 16 commits into from

Conversation

mdtro
Copy link
Member

@mdtro mdtro commented Oct 27, 2022

Improved API tokens to provide customers an easier means of detecting accidentally leaked secrets by integrating with Github's Secret Scanning Service and other static analysis tools.

Rendered RFC

text/0032-improved-api-tokens.md Outdated Show resolved Hide resolved
text/0032-improved-api-tokens.md Outdated Show resolved Hide resolved
text/0032-improved-api-tokens.md Outdated Show resolved Hide resolved
text/0032-improved-api-tokens.md Outdated Show resolved Hide resolved
text/0032-improved-api-tokens.md Outdated Show resolved Hide resolved
@misha-sentry
Copy link

a few questions:

  • "Verify the legacy token is valid/invalid" - are you just checking if its expired or not?
  • For legacy tokens, you are proposing to store it in a hashed format, and will you also have it prepended with the token type identifier?
  • What backwards compatibility issues need to be considered for making changes to the frontend (ie. not displaying the token after the one time after it is created)?

@mdtro
Copy link
Member Author

mdtro commented Oct 27, 2022

@misha-sentry

"Verify the legacy token is valid/invalid" - are you just checking if its expired or not?

This would be our standard verification. For example, does the token exist and is it not expired?

For legacy tokens, you are proposing to store it in a hashed format, and will you also have it prepended with the token type identifier?

Only the hashed format. The plaintext token value that would contain the prefix would not exist in the database.

What backwards compatibility issues need to be considered for making changes to the frontend (ie. not displaying the token after the one time after it is created)?

We deploy frontend and backend separately. Deploying both at the same time is high risk since rollouts happen gradually.
Instead, we'll have to figure out a path where they can each be deployed separately without breaking each other. I'll need to partner with the engineering teams to figure that one out.

Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please feel free to ack and resolve any of my ideas that have more cost than benefit.
Similarly if they're founded on my own ignorance or misapprehension :)

text/0032-improved-api-tokens.md Show resolved Hide resolved
text/0032-improved-api-tokens.md Outdated Show resolved Hide resolved

We need a predictable token format in order to integrate properly with secret scanning services. Our current format is a 64 character alphanumeric string. This is insufficient and would likely produce a high amount of false positives in tooling like [TruffleHog](https://github.com/trufflesecurity/trufflehog), [detect-secrets](https://github.com/Yelp/detect-secrets), [Github's Secret Scanning](https://docs.github.com/en/code-security/secret-scanning/about-secret-scanning), etc.

The suggested pattern is `snty[a-zA-Z]_[a-zA-Z0-9]{64}`. The character _after_ `snty` will be used to identify the token type.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ergonomics would benefit greatly from some internal hyphenation.
Since your bitfield is twice the size of uuid, a doubled uuid scheme would make sense, to me:

`snX-fe611dff-98e0-423a-aa9a-d1f629b0f869-05b934cf-61d8-4c1e-90ae-1d8ff290cca7ty'

The static suffix provides less (but nonzero) benefit in this case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't expect most would be typing this key in manually, but rather a copy/paste?

I don't really hold a strong opinion here on the hyphens. Arguably, if most are copying and pasting the value the impact won't be felt, so it might just be a win-win here?

text/0032-improved-api-tokens.md Show resolved Hide resolved
text/0032-improved-api-tokens.md Outdated Show resolved Hide resolved
text/0032-improved-api-tokens.md Outdated Show resolved Hide resolved
text/0032-improved-api-tokens.md Outdated Show resolved Hide resolved
text/0032-improved-api-tokens.md Outdated Show resolved Hide resolved
Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this RFC actually contains two ideas that could be implemented separately:

  • keys at rest should be hashed -- a backend-only change
  • a new format for displaying and receiving secrets -- with care this can also be backend-only: API responses will only use the new form, while requests can optionally use the old during a (relatively lengthy) transition period.

I would likely want to separate these projects so that they can proceed each at their own pace.

@mitsuhiko
Copy link
Member

@markstory this is largely unrelated to this RFC but I wonder if there would be some benefit to encoding the silo into the token. That way we could route based on the information in the token from the central silo. Since we are now talking about re-issuing tokens anyways we might take that opportunity.

@markstory
Copy link
Member

I wonder if there would be some benefit to encoding the silo into the token. That way we could route based on the information in the token from the central silo.

The ApiToken models we currently have don't include any organization links which makes encoding a region into them hard as the tokens would work for any organization the user has access to.

Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved: I'm not qualified to approve the RFC itself, but this PR is for a draft-status rfc.

For example:

- `sntyu_a1b2c3...` - identifies a **user** API token
- `sntya_a1b2c3...` - identifies an **API Application** token
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have sntrys_ prefix for org auth tokens. I think we can use the same prefix and derive the type of token from the payload (aka "facts").

Copy link
Member Author

@mdtro mdtro Sep 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sentrys_ prefix is for structured tokens. We probably should have gone with o or something to indicate an org auth token.

I think it'll be a bit before we we can derive the token type from the facts within the payload. For example, there isn't a field in the current org auth token that indicates the token type.

I think it's important to keep these separate prefixes especially when it comes to building out our service that integrates with GitHub's secret scanning program. With them, we do not need to decode a token to determine it's type, avoiding the extra processing. We can look at the first characters, determine the token type, perform a quick hashing operation, and then check for matches.


**In the first release:**

1. The frontend is updated to no longer display the token value for existing tokens.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will also need to deprecate /api/0/api-tokens/ API endpoint with some notice period.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, or at least update it to no longer return the full token value when just listing them (post token creation). I'll update the RFC.

mdtro added a commit to getsentry/sentry that referenced this pull request Jan 12, 2024
Take two of #59455.

- Add an option to toggle on/off automatically populating the
`token_last_characters` for the `ApiToken` model.

This option will ensure tokens created from here on have the field
populated. It will also allow me to thoroughly test the backfill
migration needed for existing tokens that will be coming in a future PR
by disabling the option in tests, creating a bunch of API tokens, and
then running the backfill migration test.

Tracking Issue: #58918
RFC: getsentry/rfcs#32
mdtro added a commit to getsentry/sentry that referenced this pull request Jan 17, 2024
Backfills the `token_last_characters` column on existing `ApiToken` entries.

As of #62972, new `ApiToken` entries
are being created with this column already populated with the correct
values. This backfill will only impact old tokens created prior.

Tracking Issue: #58918
Related RFC: getsentry/rfcs#32
trillville pushed a commit to getsentry/sentry that referenced this pull request Jan 19, 2024
Take two of #59455.

- Add an option to toggle on/off automatically populating the
`token_last_characters` for the `ApiToken` model.

This option will ensure tokens created from here on have the field
populated. It will also allow me to thoroughly test the backfill
migration needed for existing tokens that will be coming in a future PR
by disabling the option in tests, creating a bunch of API tokens, and
then running the backfill migration test.

Tracking Issue: #58918
RFC: getsentry/rfcs#32
trillville pushed a commit to getsentry/sentry that referenced this pull request Jan 19, 2024
Backfills the `token_last_characters` column on existing `ApiToken` entries.

As of #62972, new `ApiToken` entries
are being created with this column already populated with the correct
values. This backfill will only impact old tokens created prior.

Tracking Issue: #58918
Related RFC: getsentry/rfcs#32
@mdtro mdtro marked this pull request as draft January 31, 2024 22:15
@mdtro mdtro marked this pull request as ready for review January 31, 2024 22:15
mdtro added a commit to getsentry/sentry that referenced this pull request Feb 2, 2024
Adds backend support for naming API tokens.

- see issue #9600 and
getsentry/customer-feedback#22
- #58918
- [RFC #32](getsentry/rfcs#32)
cmanallen pushed a commit to getsentry/sentry that referenced this pull request Feb 7, 2024
mdtro added a commit to getsentry/sentry that referenced this pull request Feb 16, 2024
In support of our improved API tokens initiative
(getsentry/rfcs#32), this PR adds two null-able
columns to the `ApiToken` model that will hold the hash values.

Hashed values for token values will be implemented over several PRs to
maintain backwards compatibility and to prevent broken state between
versions.
mdtro added a commit to getsentry/sentry that referenced this pull request Feb 23, 2024
In support of getsentry/rfcs#32. 

Add a nullable `token_type` column to the `ApiToken` model. This will be
used to help us identify the different kinds of API tokens we have in
the application via a prefix. With this, we'll be able to integrate with
GitHub and others' secret scanning program to prevent token leaks.
Legacy (e.g. tokens that already exist) will have a null value here, so
we'll know they are not one of our new tokens with the prefix format
once all tokens are stored solely as hashed values.
mdtro added a commit to getsentry/sentry that referenced this pull request Feb 26, 2024
In support of getsentry/rfcs#32. 

API Tokens will be prefixed with seven extra characters (ex. `sntryu_`).
Eventually these plaintext `token` columns will be dropped, but to
maintain backwards compatibility and a smooth transition to fully hashed
tokens we'll still want to store them here.
mdtro added a commit to getsentry/sentry that referenced this pull request Apr 17, 2024
Supports getsentry/rfcs#32. 

- Newly created _user auth tokens_ will be prefixed with `sntryu_`. 
- Introduce a custom model manager for `ApiToken` to handle the unique
creation logic where we need to hash the token values and store them.
- Use the `token_type` (currently optional) parameter/field on
`ApiToken` when creating user auth tokens to let the model do the heavy
lifting on generating and hashing the values. This will keep the logic
out of views and simplify calls to just `new_token =
ApiToken.objects.create(user=user, token_type=AuthTokenType.USER)`.
- I've [changed some
behavior](https://github.com/getsentry/sentry/pull/68148/files#diff-e68bf726258b71dbfe6c6a3dcb9a959faba4e9585762067078a38bed5bad4812R36-R37)
where we only return the `refreshToken` on applicable token types.

I introduce a "read once" pattern in this PR for the token secrets to
prevent leaking of them in logs, exceptions, etc. It works like this
when creating a new `ApiToken`:

1. The model manager sets temporary fields `__plaintext_token` and
`__plaintext_refresh_token` that store the respective plaintext values
for temporary reading.
2. When reading the value through the `plaintext_token` property on
`ApiToken` (notice the single prepended underscore) the string value is
returned and `__plaintext_token` is immediately set to `None`.
3. If you attempt to read the `_plaintext_token` property again, an
exception will be raised, `PlaintextSecretAlreadyRead`.
mdtro added a commit to getsentry/sentry that referenced this pull request Jun 10, 2024
Supports getsentry/rfcs#32

We've been hashing tokens as they are used to authenticate
(#65941), but it's started to
level out. This is a backfill migration to fill in all of the hashed
values for the remaining tokens.

Huge thank you to @markstory @wedamija and @GabeVillalobos for helping
with the migration test! 🙏
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants