Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(filecache): Move from array_merge to avoid memory exhaustion on large scans #43982

Merged
merged 1 commit into from
Mar 7, 2024

Conversation

solracsf
Copy link
Member

@solracsf solracsf commented Mar 4, 2024

Summary

When merging large arrays, PHP may require a significant amount of memory, especially if the arrays contain a lot of data.
This can be the case in file scans of large filesystems.

To avoid using array_merge and reduce memory consumption, concatenate the arrays using a loop instead.

Checklist

Signed-off-by: Git'Fellow <12234510+solracsf@users.noreply.github.com>
@solracsf solracsf added the 3. to review Waiting for reviews label Mar 4, 2024
@solracsf solracsf added this to the Nextcloud 29 milestone Mar 4, 2024
@solracsf solracsf requested review from a team, ArtificialOwl, icewind1991, nfebe, ChristophWurst and blizzz and removed request for a team March 4, 2024 17:17
@icewind1991
Copy link
Member

do you know if $array += $other has the same memory usage as array_merge?

@solracsf
Copy link
Member Author

solracsf commented Mar 4, 2024

do you know if $array += $other has the same memory usage as array_merge?

Never tested such approach. In general, foreach gives good memory control and very little performance penalties so it's widelly adopted (for very large arrays).

EDIT: @icewind1991 i've tested with a simple script, and indeed it uses less memory than array_merge!

12683KB for $array += $other and 20879KB for array_merge for the same numeric data set (basically 2 arrays of 250K numbers).
So almost 40% less memory usage in a small test! Good spot :) (tested at https://onlinephp.io/ with PHP 8.2.12)

But i'm not familiar with the result of the merge in this case (preservation of keys, associative and indexed arrays...).
If you believe it's a better approach in this specific case, please let me know.

@icewind1991
Copy link
Member

I'm not actually sure if += works here since it overwrites items with the same (implicit) keys

@skjnldsv skjnldsv added 4. to release Ready to be released and/or waiting for tests to finish feature: files and removed 3. to review Waiting for reviews labels Mar 7, 2024
@skjnldsv skjnldsv merged commit 7dd22b0 into master Mar 7, 2024
158 of 160 checks passed
@skjnldsv skjnldsv deleted the fixMemExaust branch March 7, 2024 21:53
@Altahrim Altahrim mentioned this pull request Mar 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
4. to release Ready to be released and/or waiting for tests to finish feature: files
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants