-
-
Notifications
You must be signed in to change notification settings - Fork 795
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Batch tokens deletion in cleartokens command #969
Batch tokens deletion in cleartokens command #969
Conversation
Codecov Report
@@ Coverage Diff @@
## master #969 +/- ##
==========================================
+ Coverage 96.64% 96.67% +0.03%
==========================================
Files 31 31
Lines 1756 1775 +19
==========================================
+ Hits 1697 1716 +19
Misses 59 59
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unit and integration tests for the changed logics are needed. also need explanation why atomic transacion is not needed here.
I am a bit worried about the performance that this may have on large postrgesql databases because of the .count() and id__in, can we measure and report a run against a big table to see how it performs? |
source: https://docs.djangoproject.com/en/3.2/ref/models/querysets/#when-querysets-are-evaluated Performance is the reason why this PR emerges. I've tried to remove ~1.5M of stale tokens and I've failed because very high RAM usage. When I'd introduced batching it passed without any peaks in resources of machine on postgres installed on very basic AWS instance. Because |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delayed follow up on this PR. It looks good. Please rebase and resolve conflicts. I'm targeting the next minor release for this. Thanks! (I'm afraid to check our production DB to see how many expired tokens are in it;-)
Do not check for merge conflicts in AUTHORS file, because ======= in 2nd line triggers the error.
e4428bc
to
2340bdf
Compare
I had to exclude AUTHORS from check-merge-conflict, because it causes
You can keep this change or change separator in AUTHORS file. |
I think I'll approach this by fixing AUTHORS separately as the problem doesn't "belong" to this PR. Also, since I'm targeting 1.7.0 for this as it's more than a patch, let's hold it off a bit until 1.6.1. gets published. |
pre-commit/pre-commit-hooks#100 describes the problems with RST underline. |
@auvipy no test existed prior to this PR so let's accept it as is and put writing a test for the command on the backlog. |
@@ -101,6 +101,8 @@ | |||
# Whether to re-create OAuthlibCore on every request. | |||
# Should only be required in testing. | |||
"ALWAYS_RELOAD_OAUTHLIB_CORE": False, | |||
"CLEAR_EXPIRED_TOKENS_BATCH_SIZE": 10000, | |||
"CLEAR_EXPIRED_TOKENS_BATCH_INTERVAL": 0.1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@merito Sorry this is after the fact but wouldn't a default value of 0 be best, especially since the sleep is always executed even if the batch is tiny.
https://github.com/merito/django-oauth-toolkit/blob/725c3c9d8927379c9808abd1badb4fcd9ff1cbaa/oauth2_provider/models.py#L636
Fixes #651
Description of the Change
Use batches when deleting expired tokens using
cleartokens
management command. Introduce two setting variables to configure batch size and interval between deletions. The defaults were tested on quite weak machines and ~1.5M deleted tokens. It allows to runcleartokens
without any downtime.Checklist
CHANGELOG.md
updated (only for user relevant changes)AUTHORS