-
-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add maintenance:repair step for file size of unencrypted files #36220
base: master
Are you sure you want to change the base?
Add maintenance:repair step for file size of unencrypted files #36220
Conversation
Signed-off-by: Simon Spannagel <simonspa@kth.se>
Signed-off-by: Simon Spannagel <simonspa@kth.se>
Signed-off-by: Simon Spannagel <simonspa@kth.se>
Signed-off-by: Simon Spannagel <simonspa@kth.se>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 from me, I hope the second reviewer knows more about DB perf
good luck appeasing the bots 😄
thanks a lot!
Co-authored-by: Anna <anna@nextcloud.com> Signed-off-by: simonspa <1677436+simonspa@users.noreply.github.com>
Drone failures look related |
->where($update->expr()->eq('encrypted', "0", IQueryBuilder::PARAM_INT)) | ||
->andWhere($update->expr()->neq('unencrypted_size', "0", IQueryBuilder::PARAM_INT)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no index on neither of those columns, so we may want to add at least to the unencrypted size.
I think the most safe way to use such an index then would be:
->where($update->expr()->eq('encrypted', "0", IQueryBuilder::PARAM_INT)) | |
->andWhere($update->expr()->neq('unencrypted_size', "0", IQueryBuilder::PARAM_INT)); | |
->andWhere($update->expr()->gt('unencrypted_size', "0", IQueryBuilder::PARAM_INT)); | |
->andWhere($update->expr()->eq('encrypted', "0", IQueryBuilder::PARAM_INT)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the reasoning:
- neq may not use the index
- the order of where statements sometimes also matters to decide if the index is applied
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EXPLAIN of the current query does a full table scan:
explain update oc_filecache set unencrypted_size = 0 where encrypted = 0 and unencrypted_size != 0;
+------+-------------+--------------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+--------------+-------+---------------+---------+---------+------+------+-------------+
| 1 | SIMPLE | oc_filecache | index | NULL | PRIMARY | 8 | NULL | 8479 | Using where |
+------+-------------+--------------+-------+---------------+---------+---------+------+------+-------------+
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After reversal of arguments and using >
query affects smaller number of rows:
explain update oc_filecache set unencrypted_size = 0 where unencrypted_size > 0 and encrypted = 0;
+------+-------------+--------------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+--------------+-------+---------------+---------+---------+------+------+-------------+
| 1 | SIMPLE | oc_filecache | index | NULL | PRIMARY | 8 | NULL | 490 | Using where |
+------+-------------+--------------+-------+---------------+---------+---------+------+------+-------------+
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I implemented the suggested change - but adding an index to either of these columns would have to done separately, and probably run manually, right? How would that work together with an automatic repair step?
Signed-off-by: Simon Spannagel <simonspa@kth.se>
574181b
to
43da5b0
Compare
Performance tests is failing with
|
@simonspa this PR looks nice :) I'm reopening in the hopes that you're still interested in pushing this forward! Since you're a member of Nextcloud, could you recreate this PR in a branch directly in the nextcoud/server repo? Feel free and encouraged to check the Nextcloud Commit guidelines. Hope we can keep this cool improvement alive! 🚀 |
Summary
This resets the
unencrypted_size
of files withencryption = 0
to0
. As described in the linked issue at length, there seem to be setups that either have or had encryption enabled in the past, where even for unencrypted files theunencrypted_size
column shows non-zero values.These seem to completely confuse the quota calculation and result in partially unusable systems because users might have a huge quota usage reported while the actual used size is just a few MB.
TODO
Checklist