Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add maintenance:repair step for file size of unencrypted files #36220

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

simonspa
Copy link
Contributor

@simonspa simonspa commented Jan 18, 2023

Summary

This resets the unencrypted_size of files with encryption = 0 to 0. As described in the linked issue at length, there seem to be setups that either have or had encryption enabled in the past, where even for unencrypted files the unencrypted_size column shows non-zero values.

These seem to completely confuse the quota calculation and result in partially unusable systems because users might have a huge quota usage reported while the actual used size is just a few MB.

TODO

  • Add tests

Checklist

Simon Spannagel added 2 commits January 18, 2023 15:45
Signed-off-by: Simon Spannagel <simonspa@kth.se>
Signed-off-by: Simon Spannagel <simonspa@kth.se>
Simon Spannagel added 2 commits January 18, 2023 16:01
Signed-off-by: Simon Spannagel <simonspa@kth.se>
Signed-off-by: Simon Spannagel <simonspa@kth.se>
Copy link
Member

@PVince81 PVince81 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 from me, I hope the second reviewer knows more about DB perf

good luck appeasing the bots 😄
thanks a lot!

Co-authored-by: Anna <anna@nextcloud.com>
Signed-off-by: simonspa <1677436+simonspa@users.noreply.github.com>
Signed-off-by: Simon Spannagel <simonspa@kth.se>
@szaimen szaimen added this to the Nextcloud 26 milestone Jan 18, 2023
@szaimen
Copy link
Contributor

szaimen commented Jan 19, 2023

Drone failures look related

@szaimen szaimen added 2. developing Work in progress and removed 3. to review Waiting for reviews labels Jan 19, 2023
Comment on lines 48 to 49
->where($update->expr()->eq('encrypted', "0", IQueryBuilder::PARAM_INT))
->andWhere($update->expr()->neq('unencrypted_size', "0", IQueryBuilder::PARAM_INT));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no index on neither of those columns, so we may want to add at least to the unencrypted size.

I think the most safe way to use such an index then would be:

Suggested change
->where($update->expr()->eq('encrypted', "0", IQueryBuilder::PARAM_INT))
->andWhere($update->expr()->neq('unencrypted_size', "0", IQueryBuilder::PARAM_INT));
->andWhere($update->expr()->gt('unencrypted_size', "0", IQueryBuilder::PARAM_INT));
->andWhere($update->expr()->eq('encrypted', "0", IQueryBuilder::PARAM_INT))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the reasoning:

  • neq may not use the index
  • the order of where statements sometimes also matters to decide if the index is applied

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EXPLAIN of the current query does a full table scan:


explain update oc_filecache set unencrypted_size = 0 where encrypted = 0 and unencrypted_size != 0;
+------+-------------+--------------+-------+---------------+---------+---------+------+------+-------------+
| id   | select_type | table        | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
+------+-------------+--------------+-------+---------------+---------+---------+------+------+-------------+
|    1 | SIMPLE      | oc_filecache | index | NULL          | PRIMARY | 8       | NULL | 8479 | Using where |
+------+-------------+--------------+-------+---------------+---------+---------+------+------+-------------+

Copy link
Contributor Author

@simonspa simonspa Jan 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After reversal of arguments and using > query affects smaller number of rows:

explain update oc_filecache set unencrypted_size = 0 where unencrypted_size > 0 and encrypted = 0;
+------+-------------+--------------+-------+---------------+---------+---------+------+------+-------------+
| id   | select_type | table        | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
+------+-------------+--------------+-------+---------------+---------+---------+------+------+-------------+
|    1 | SIMPLE      | oc_filecache | index | NULL          | PRIMARY | 8       | NULL | 490  | Using where |
+------+-------------+--------------+-------+---------------+---------+---------+------+------+-------------+

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I implemented the suggested change - but adding an index to either of these columns would have to done separately, and probably run manually, right? How would that work together with an automatic repair step?

@simonspa simonspa requested a review from juliusknorr January 29, 2023 14:17
Signed-off-by: Simon Spannagel <simonspa@kth.se>
@simonspa simonspa force-pushed the p-maintenance-unencrypted-size branch from 574181b to 43da5b0 Compare January 29, 2023 14:30
@simonspa
Copy link
Contributor Author

Performance tests is failing with

fatal: remote error: upload-pack: not our ref 03c3817ff132653c794fd04410977952f69fd614
  • is this owing to the fact that this pull comes from my fork rather than a branch within this repo?

@blizzz blizzz mentioned this pull request Feb 1, 2023
@simonspa simonspa requested a review from PVince81 February 6, 2023 19:19
@skjnldsv skjnldsv mentioned this pull request Feb 23, 2023
This was referenced May 3, 2023
@blizzz blizzz mentioned this pull request May 17, 2023
@blizzz blizzz modified the milestones: Nextcloud 27, Nextcloud 28 May 23, 2023
@skjnldsv skjnldsv mentioned this pull request Nov 1, 2023
This was referenced Nov 6, 2023
This was referenced Nov 14, 2023
@blizzz blizzz modified the milestones: Nextcloud 28, Nextcloud 29 Nov 23, 2023
This was referenced Mar 12, 2024
This was referenced Mar 20, 2024
@skjnldsv skjnldsv mentioned this pull request Mar 28, 2024
81 tasks
@skjnldsv skjnldsv modified the milestones: Nextcloud 29, Nextcloud 30 Mar 28, 2024
This was referenced Jul 30, 2024
This was referenced Aug 5, 2024
@skjnldsv skjnldsv mentioned this pull request Aug 13, 2024
@skjnldsv skjnldsv closed this Aug 14, 2024
@skjnldsv skjnldsv removed this from the Nextcloud 30 milestone Aug 14, 2024
@sorbaugh
Copy link
Contributor

@simonspa this PR looks nice :) I'm reopening in the hopes that you're still interested in pushing this forward! Since you're a member of Nextcloud, could you recreate this PR in a branch directly in the nextcoud/server repo? Feel free and encouraged to check the Nextcloud Commit guidelines. Hope we can keep this cool improvement alive! 🚀

@sorbaugh sorbaugh reopened this Aug 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants