Remove bespoke throttle exemption #792

sarayourfriend · 2022-07-11T23:21:34Z

Fixes

Unblocks WordPress/openverse-frontend#1558

Description

Removes the throttle exemption classes and switches to just using the scoped OAuth tokens. While that is also bespoke, it already exists and this allows us to delete a substantial amount of hot-path code 🙂

I wrote some documentation to describe the state of the throttling and authentication world. Please read that, then read this section unless you're already familiar with concepts like rate limit models and throttle scopes (and how they're related).

This PR adds a new rate limit model of "exempt" and ties it to a scope with None throttling configuration, which is the same as setting it to infinity. This truly exempts the requester from all rate limiting, just like how the frontend is currently exempting itself via IP address.

In the interest of now having to coordinate a complicated deployment, I've left the IP based throttling in for now.

Before deploying this PR we MUST update the WordPress.com/Jetpack application to have the exempt rate limit model in order to maintain the current behavior of that application.

Testing Instructions

Note that in addition to below I've also added comprehensive unit tests to the throttling module. If you notice anything missing, please let me know so I can add it.

Please also review the documentation I added and let me know if it needs any clarification.
To test this locally, I recommend using the load testing script from this PR: WordPress/openverse#257

Update settings.py to:

Enable throttling
Add your docker host address to ALLOWED_HOSTS. This will either be host.docker.internal on Docker Desktop or 172.17.0.1 otherwise.

Next create an application and token by following the authentication flow on your local API. Open up Django admin to be able to move the application between the different rate limit models.

I would start by first testing anonymous rate limiting by putting a bogus api key in the load testing script. Confirm that you're quickly rate limited.

Next, use your actual key. Fiddle with the api.sh variables of requests and concurrency until you surpass the rate limit. This is testing the "standard" model.

Move the application into the "enhanced" model and do the same. You should be able to overwhelm this throttle by setting the api load testing script to make a little over 200 requests, maybe slightly more due to throttle race conditions. Use 300 and you'll definitely see it rate limit.

Now move the application into the "exempt" model and repeat. You can make the number of requests as high as you want, but you'll never get throttled.

Add your local IP to the ip-whitelist in your local redis. I don't know what the best way to find this IP is, it might be different per computer. I just put a breakpoint in a request and check to see what the IP of the request is from inside the API.

Run just exec cache bash then redis-cli. Add your IP to ip-whitelist by running:

> sadd "ip-whitelist" "<ip>"

Note the quotes are significant here!

Without restarting your API (because the cache is not persistent between restarts on local), make the same anonymous api load testing requests and confirm that you are likewise never throttled.

We will remove this final behavior in a subsequent PR once the frontend has been migrated to use API keys to circumvent the rate limiting.

Checklist

My pull request has a descriptive title (not a vague title like Update index.md).
My pull request targets the default branch of the repository (main) or a parent feature branch.
My commit messages follow best practices.
My code follows the established code style of the repository.
I added or updated tests for the changes I made (if applicable).
I added or updated documentation (if applicable).
I tried running the project locally and verified that there are no visible errors.

Developer Certificate of Origin

Developer Certificate of Origin
Version 1.1

Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
1 Letterman Drive
Suite D4700
San Francisco, CA, 94129

Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.


Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
    have the right to submit it under the open source license
    indicated in the file; or

(b) The contribution is based upon previous work that, to the best
    of my knowledge, is covered under an appropriate open source
    license and I have the right under that license to submit that
    work with modifications, whether created in whole or in part
    by me, under the same open source license (unless I am
    permitted to submit under a different license), as indicated
    in the file; or

(c) The contribution was provided directly to me by some other
    person who certified (a), (b) or (c) and I have not modified
    it.

(d) I understand and agree that this project and the contribution
    are public and that a record of the contribution (including all
    personal information I submit with it, including my sign-off) is
    maintained indefinitely and may be redistributed consistent with
    this project or the open source license(s) involved.

github-actions · 2022-07-11T23:22:46Z

API Developer Docs Preview: Ready

https://wordpress.github.io/openverse-api/_preview/792

Please note that GitHub pages takes a little time to deploy newly pushed code, if the links above don't work or you see old versions, wait 5 minutes and try again.

You can check the GitHub pages deployment action list to see the current status of the deployments.

dhruvkb

Looks good to me!

justfile

dhruvkb · 2022-07-19T14:30:06Z

api/docs/reference/api/authentication_and_throttling.md

+
+* `standard`: The default rate limit that anyone is able to unlock simply by authenticating. Slightly higher and much more useful limit for an application.
+* `enhanced`: A selectively granted rate limit that allows for significantly higher burst and sustained requests.
+* `exempt`: A rate limit model that exists purely to accommodate internal infrastructural needs. For example, the server that renders the frontend can use tokens of this tier to prevent itself from being rate limited by the API during server side rendering. Likewise, when we need to circumvent the throttling to do load testing, we use a key of this tier.


These tiers are so well-named.

obulat · 2022-07-19T15:13:42Z

I'm having trouble initializing the API:

ModuleNotFoundError: No module named 'log_request_id'

dhruvkb · 2022-07-19T15:19:15Z

@obulat you need to rebuild the web image. This is required whenever Python packages are changed (which recently have).

$ docker-compose build web

AetherUnbound

Looks good, thanks for the extra documentation! I just have two notes.

api/catalog/api/utils/oauth2_helper.py

api/docs/reference/api/authentication_and_throttling.md

AetherUnbound

I wasn't able to get the testing steps to work locally, even when I set my application to exempt I was still getting throttled with that key 😕 I feel like I must be doing something wrong, but I'm not sure where. I'm confident in the code though, and I know this is currently blocking. If someone else was able to get the testing steps working please let me know!

sarayourfriend · 2022-07-19T23:40:46Z

To verify some things, @AetherUnbound, as it concerns me that you weren't able to get it to work.

Did you run migrations (just dj migrate)? Did you try recreating + init?

I left out the instructions to run the migration from the PR description, sorry if that is the issue!

AetherUnbound · 2022-07-19T23:51:45Z

I did run the migration, but that was after I made the application 🤔 do you think that would have had an effect? I'll try recreate + init!

sarayourfriend · 2022-07-19T23:55:08Z

I'm not sure! It would be a bad sign if it did cause a problem as that would be one of the situations we'll run into when we migrate some of the existing "exempted" applications 🤔

sarayourfriend · 2022-07-19T23:58:29Z

Which reminds me... we will have to be careful when we deploy this PR. We'll have to run the migrations, then update the rate_limit_model for currently exempted applications, then deploy the rest of the PR.

Interesting to note that with automated deployments we won't be able to follow this manual process... in those cases we'd have to split this PR into two:

First a part for the migration;
then, the rest of the PR with the refactor and code removal

We are still months from that scenario but it is something for us to start thinking about now so we're not caught off guard when the time does come.

AetherUnbound · 2022-07-20T00:04:54Z

Ah okay, I got it working! I also had to set the throttled application as "verified". One other thing to note though, the Admin UI required that I set a value for Client type before I could change the exception level. Is that an issue?

sarayourfriend · 2022-07-20T00:10:34Z

It's not an issue, we can just set it to public I guess? Or we could fix it in this PR and include a data migration to update existing applications to have public set explicitly.

I'm not sure why it's able to be empty 🤔 The column is configured like this:

    client_type = models.CharField(max_length=32, choices=CLIENT_TYPES)

https://github.com/jazzband/django-oauth-toolkit/blob/master/oauth2_provider/models.py#L106

Kind of strange, no?

obulat · 2022-07-20T07:10:19Z

What is the correct path for running migrations? I couldn't pass the API healthcheck, so I ran just recreate (I also deleted all the images and cache to be sure). When I try to run just init after that, I get the relation doesn't exist error because the migrations weren't applied. So, I tried recreating again, this time running just dj migrate before running just init. But it still didn't work, the API healthcheck failed.
I did manage to run the migrations using Docker desktop to open the API container shell. Then I got throttled on the healthcheck requests 😆 (because just init was still running and querying that endpoint)

sarayourfriend · 2022-07-20T12:00:07Z

@obulat can you run just logs web and see if there is an error in the API causing the healthcheck to fail?

obulat · 2022-07-20T14:30:05Z

@obulat can you run just logs web and see if there is an error in the API causing the health check to fail?

I can no longer reproduce the API error; it works well after I ran the migrations inside the web shell. But it was something about missing relations in the database.

AetherUnbound · 2022-07-20T22:09:11Z

It's not an issue, we can just set it to public I guess? Or we could fix it in this PR and include a data migration to update existing applications to have public set explicitly.

I'm not sure why it's able to be empty thinking The column is configured like this:
    client_type = models.CharField(max_length=32, choices=CLIENT_TYPES)
https://github.com/jazzband/django-oauth-toolkit/blob/master/oauth2_provider/models.py#L106

Kind of strange, no?

I checked this again on #808 (which is off of main), the exact same thing happened there! So it's nothing related to this branch, and it seems we've been working with that quirk without issue 🙂

sarayourfriend added 3 commits July 11, 2022 18:25

Add attach script

4f862a7

Remove bespoke throttle exemption and use existing custom scoping

bfe0f44

Add documentation to index

5f831d2

sarayourfriend requested a review from a team as a code owner July 11, 2022 23:21

sarayourfriend requested review from obulat and stacimc July 11, 2022 23:21

openverse-bot added 💻 aspect: code Concerns the software code in the repository 🟨 priority: medium Not blocking but should be addressed soon 🧰 goal: internal improvement Improvement that benefits maintainers, not users labels Jul 11, 2022

sarayourfriend added 4 commits July 12, 2022 10:22

Lint

34fb71f

Fix tests and remove leaky logs

046c9a6

Lint

8524b76

Remove expiration exemption

ccd65f7

dhruvkb approved these changes Jul 19, 2022

View reviewed changes

Add comment explaining new recipe

a320b89

AetherUnbound reviewed Jul 19, 2022

View reviewed changes

api/catalog/api/utils/oauth2_helper.py Show resolved Hide resolved

api/docs/reference/api/authentication_and_throttling.md Outdated Show resolved Hide resolved

sarayourfriend added 2 commits July 19, 2022 13:23

Add early return and increase log severity

0e93aec

Add missing link

36c9a87

AetherUnbound approved these changes Jul 19, 2022

View reviewed changes

sarayourfriend merged commit 419082f into main Jul 21, 2022

sarayourfriend deleted the remove/bespoke-throttle-exemption branch July 21, 2022 00:42

sarayourfriend mentioned this pull request Jul 22, 2022

Fix missing rate limit model from rate_limit info endpoint #815

Merged

7 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Remove bespoke throttle exemption #792

Remove bespoke throttle exemption #792

sarayourfriend commented Jul 11, 2022

github-actions bot commented Jul 11, 2022 •

edited

Loading

dhruvkb left a comment

dhruvkb Jul 19, 2022

obulat commented Jul 19, 2022

dhruvkb commented Jul 19, 2022 •

edited

Loading

AetherUnbound left a comment

AetherUnbound left a comment

sarayourfriend commented Jul 19, 2022

AetherUnbound commented Jul 19, 2022

sarayourfriend commented Jul 19, 2022

sarayourfriend commented Jul 19, 2022

AetherUnbound commented Jul 20, 2022

sarayourfriend commented Jul 20, 2022

obulat commented Jul 20, 2022

sarayourfriend commented Jul 20, 2022

obulat commented Jul 20, 2022

AetherUnbound commented Jul 20, 2022

Remove bespoke throttle exemption #792

Remove bespoke throttle exemption #792

Conversation

sarayourfriend commented Jul 11, 2022

Fixes

Description

Testing Instructions

Checklist

Developer Certificate of Origin

github-actions bot commented Jul 11, 2022 • edited Loading

dhruvkb left a comment

Choose a reason for hiding this comment

dhruvkb Jul 19, 2022

Choose a reason for hiding this comment

obulat commented Jul 19, 2022

dhruvkb commented Jul 19, 2022 • edited Loading

AetherUnbound left a comment

Choose a reason for hiding this comment

AetherUnbound left a comment

Choose a reason for hiding this comment

sarayourfriend commented Jul 19, 2022

AetherUnbound commented Jul 19, 2022

sarayourfriend commented Jul 19, 2022

sarayourfriend commented Jul 19, 2022

AetherUnbound commented Jul 20, 2022

sarayourfriend commented Jul 20, 2022

obulat commented Jul 20, 2022

sarayourfriend commented Jul 20, 2022

obulat commented Jul 20, 2022

AetherUnbound commented Jul 20, 2022

github-actions bot commented Jul 11, 2022 •

edited

Loading

dhruvkb commented Jul 19, 2022 •

edited

Loading