Large data files left in S3 object storage after bad uploads #29841

Open
blazejhanzel opened this issue Nov 22, 2021 · 13 comments
Labels
0. Needs triage Pending check for reproducibility or if it fits our roadmap feature: object storage hotspot: file transfer performance upload & download performance related optimizations needs review Needs review to determine if still applicable technical debt

Comments

@blazejhanzel

Steps to reproduce

  1. Send files via desktop client or web
  2. Abort the transfer by breaking the internet connection (or wait for the desktop client's error timeout / byte count mismatch)
  3. Check the S3 container for 500 MB files with meaningless binary content

Expected behaviour

The server should notice that the upload was aborted and delete the incompletely uploaded files from object storage.

Actual behaviour

The server leaves the objects in object storage (OVH S3) as 500 MB files. They cannot be cleared using occ files:scan --all or occ files:cleanup.

Server configuration

Operating system: Ubuntu 20.04 LTS

Web server: Apache/2.4.41

Database: MySQL 10.3.31

PHP version: 7.4.3

Nextcloud version: 22.2.0.2

Updated from an older Nextcloud/ownCloud or fresh install: fresh install

Where did you install Nextcloud from: zip file

Signing status:

No errors have been found.

List of activated apps:

App list
If you have access to your command line run e.g.:
Enabled:
  - accessibility: 1.8.0
  - activity: 2.15.0
  - apporder: 0.13.0
  - bruteforcesettings: 2.2.0
  - circles: 22.1.1
  - cloud_federation_api: 1.5.0
  - comments: 1.12.0
  - contacts: 4.0.6
  - contactsinteraction: 1.3.0
  - dashboard: 7.2.0
  - dav: 1.19.0
  - deck: 1.5.5
  - federatedfilesharing: 1.12.0
  - federation: 1.12.0
  - files: 1.17.0
  - files_external: 1.13.0
  - files_pdfviewer: 2.3.0
  - files_rightclick: 1.1.0
  - files_sharing: 1.14.0
  - files_trashbin: 1.12.0
  - files_versions: 1.15.0
  - files_videoplayer: 1.11.0
  - firstrunwizard: 2.11.0
  - groupfolders: 10.0.0
  - logreader: 2.7.0
  - lookup_server_connector: 1.10.0
  - nextcloud_announcements: 1.11.0
  - notes: 4.2.0
  - notifications: 2.10.1
  - oauth2: 1.10.0
  - password_policy: 1.12.0
  - privacy: 1.6.0
  - provisioning_api: 1.12.0
  - quota_warning: 1.11.0
  - recommendations: 1.1.0
  - serverinfo: 1.12.0
  - settings: 1.4.0
  - sharebymail: 1.12.0
  - support: 1.5.0
  - survey_client: 1.10.0
  - systemtags: 1.12.0
  - tasks: 0.14.2
  - text: 3.3.0
  - theming: 1.13.0
  - twofactor_backupcodes: 1.11.0
  - updatenotification: 1.12.0
  - user_status: 1.2.0
  - viewer: 1.6.0
  - weather_status: 1.2.0
  - workflowengine: 2.4.0
Disabled:
  - admin_audit
  - encryption
  - photos
  - user_ldap

Nextcloud configuration:

Config report
{
    "system": {
        "instanceid": "***REMOVED SENSITIVE VALUE***",
        "passwordsalt": "***REMOVED SENSITIVE VALUE***",
        "secret": "***REMOVED SENSITIVE VALUE***",
        "trusted_domains": [
            "***REMOVED SENSITIVE VALUE***"
        ],
        "datadirectory": "***REMOVED SENSITIVE VALUE***",
        "objectstore": {
            "class": "\\OC\\Files\\ObjectStore\\S3",
            "arguments": {
                "bucket": "nextcloud",
                "autocreate": true,
                "key": "***REMOVED SENSITIVE VALUE***",
                "secret": "***REMOVED SENSITIVE VALUE***",
                "hostname": "storage.waw.cloud.ovh.net",
                "port": 443,
                "region": "waw",
                "use_ssl": true,
                "use_path_style": true
            }
        },
        "dbtype": "mysql",
        "version": "22.2.0.2",
        "overwrite.cli.url": "***REMOVED SENSITIVE VALUE***",
        "dbname": "***REMOVED SENSITIVE VALUE***",
        "dbhost": "***REMOVED SENSITIVE VALUE***",
        "dbport": "",
        "dbtableprefix": "oc_",
        "mysql.utf8mb4": true,
        "dbuser": "***REMOVED SENSITIVE VALUE***",
        "dbpassword": "***REMOVED SENSITIVE VALUE***",
        "installed": true,
        "default_phone_region": "PL",
        "mail_from_address": "***REMOVED SENSITIVE VALUE***",
        "mail_smtpmode": "smtp",
        "mail_sendmailmode": "smtp",
        "mail_domain": "***REMOVED SENSITIVE VALUE***",
        "mail_smtpauthtype": "LOGIN",
        "mail_smtpauth": 1,
        "mail_smtphost": "***REMOVED SENSITIVE VALUE***",
        "mail_smtpport": "587",
        "mail_smtpname": "***REMOVED SENSITIVE VALUE***",
        "mail_smtppassword": "***REMOVED SENSITIVE VALUE***",
        "has_rebuilt_cache": true
    }
}

Are you using external storage, if yes which one: S3 Object Storage as default nextcloud data storage

Are you using encryption: no

Are you using an external user-backend, if yes which one: Not sure, probably no

Client configuration

Browser: Chromium-based 95.0.1020.53

Operating system: Windows 10, Windows 11, GNU/Linux

Desktop client version: 3.3.6 (Windows)

Logs

Web server error log

Server error log
[Mon Nov 22 08:56:47.033194 2021] [access_compat:error] [pid 424412] [client 209.141.34.220:46930] AH01797: client denied by server configuration: /var/www/html/config/getuser
[Mon Nov 22 13:53:42.980006 2021] [access_compat:error] [pid 429447] [client 209.141.34.220:49522] AH01797: client denied by server configuration: /var/www/html/config/getuser
[Mon Nov 22 16:55:45.673139 2021] [php7:error] [pid 434037] [client 213.231.8.6:56997] script '/var/www/html/wp-login.php' not found or unable to stat, referer: http://***PRIVATE HOSTNAME***/wp-login.php

Nextcloud log (data/nextcloud.log)

Nextcloud log
Not sure how to get this from S3
@blazejhanzel blazejhanzel added 0. Needs triage Pending check for reproducibility or if it fits our roadmap bug labels Nov 22, 2021
@NeoTheThird

NeoTheThird commented Dec 1, 2021

Cannot clear them using occ files:scan --all and occ files:cleanup.

Afaics the files:* commands do not affect object storage used as primary storage at all, but it looks like that might be the intended behavior? Changing this (or adding a new command for object storage) could be a potential fix, since it is important not only to prevent new faulty files from appearing but also to get rid of the old ones.

My object storage ballooned to almost four times the size of my users' accumulated used storage due to this issue. To at least get rid of some stuff from my object storage, I ran occ trashbin:cleanup --all-users and occ versions:cleanup. That of course does not fix the underlying issue, but it does reduce my hosting bill a little bit (at the cost of some convenience for my users).

@otherguy

otherguy commented Feb 6, 2022

I had the same issue #30762 and wrote a cronjob that cleans up these uploads.

I published it here: https://github.com/otherguy/nextcloud-cleanup

It's extremely simple and for now only works with Scaleway's S3 Object storage and MySQL/MariaDB but I'm happy to accept PRs to make it more versatile. The changes required for Amazon's S3 storage would be minimal.
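
For readers who want the general idea behind this kind of cleanup, here is a minimal Python sketch. It is not the implementation of the linked cronjob. It assumes that, with S3 as primary storage, Nextcloud names each object `urn:oid:<fileid>`, where `<fileid>` matches a row in the `oc_filecache` table, and that you have already fetched the bucket's keys and the known file IDs yourself. The function name `find_orphaned_keys` and the sample data are hypothetical.

```python
import re

# Assumption: Nextcloud primary object storage uses keys of the form
# "urn:oid:<fileid>", where <fileid> is the numeric id in oc_filecache.
OID_PATTERN = re.compile(r"^urn:oid:(\d+)$")


def find_orphaned_keys(bucket_keys, known_file_ids):
    """Return the keys that look like Nextcloud objects but have no
    matching file id in the database.

    bucket_keys    -- iterable of object keys listed from the bucket
    known_file_ids -- iterable of integer file ids read from oc_filecache
    """
    known = set(known_file_ids)
    orphans = []
    for key in bucket_keys:
        match = OID_PATTERN.match(key)
        if match is None:
            # Not a Nextcloud-style key: leave it alone.
            continue
        if int(match.group(1)) not in known:
            orphans.append(key)
    return orphans


if __name__ == "__main__":
    # Hypothetical sample data for illustration only.
    keys = ["urn:oid:101", "urn:oid:102", "urn:oid:999", "backup.tar"]
    file_ids = [101, 102]
    print(find_orphaned_keys(keys, file_ids))
```

The sketch only reports candidates and deletes nothing. An upload that is still in progress could show up as an orphan, so anything built on this should be run against a backup or with a grace period first.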

@Scandiravian

I've written a Python script that does something similar to the cronjob @otherguy has made, but for Minio+Postgres. I made a comment in a related issue with a disclaimer that I recommend reading before trying it out on your own: #20333

@szaimen

This comment was marked as resolved.

@Scandiravian

@szaimen Thanks for communicating what you need to move forward with this issue. I appreciate the effort to clean up the backlog, as I had forgotten about this issue after I fixed the problem that was causing connections to be dropped.

I am not sure if I will have time to reproduce this in the foreseeable future, so for anyone interested in confirming whether this issue is still affecting Nextcloud, here's what I think is needed to reproduce it:

  1. Spin up Nextcloud, Postgres, and Minio (or another S3 compatible service)
  2. Configure Nextcloud to use S3 as primary storage (relevant docs)
  3. Set a low upload size limit (10M or similar) in nextcloud/.user.ini (relevant docs)
  4. Create a folder with a single file that is larger than the limit set in step 3
  5. Log in to Nextcloud and delete everything in the default user's files pane
  6. Check that the storage bucket in Minio is now empty (it might be necessary to run garbage collection before the bucket is cleaned up)
  7. Connect the nextcloud-client to the Nextcloud backend
  8. Set the nextcloud-client to sync the folder set in step 4
  9. Confirm that the upload fails through the logs for the nextcloud-client
  10. Stop the nextcloud-client from syncing to the server
  11. Trigger garbage collection
  12. Check the storage bucket in Minio. If it is no longer empty, the bug is still present

I wrote this from memory, so if anyone spots a mistake, let me know and I'll update the steps

@szaimen szaimen closed this as completed Mar 6, 2023
@otherguy

otherguy commented Mar 6, 2023

@szaimen is there a changelog that mentions this?

@szaimen
Contributor

szaimen commented Mar 6, 2023

Ah sorry, closed this by accident. In which nc version did you reproduce the issue?

@szaimen szaimen reopened this Mar 6, 2023
@otherguy

otherguy commented Mar 6, 2023

Definitely up to 24.x

@frittentheke

  1. Partial and unsuccessful uploads should certainly be recognized and cleaned away from object storage. So this bug is more about the server not aborting the upload to S3 for a chunk / file not received from the client completely, right?

  2. When doing multipart uploads (see Use MultipartUpload for uploading chunks to s3 #27034) one would usually use a lifecycle policy (https://docs.aws.amazon.com/AmazonS3/latest/userguide/mpu-abort-incomplete-mpu-lifecycle-config.html) to ensure that the parts of a multipart upload that is not completed within a certain time frame are deleted.
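
To illustrate point 2, here is a rough Python sketch of what such a lifecycle rule looks like, following the structure described in the linked AWS documentation. The helper name `abort_incomplete_mpu_lifecycle`, the rule ID, and the 7-day default are my own assumptions for illustration. Whether a given S3-compatible provider (OVH, Scaleway, Minio) honours this rule is something to check in that provider's documentation.

```python
import json


def abort_incomplete_mpu_lifecycle(days=7, rule_id="abort-incomplete-mpu"):
    """Build a bucket lifecycle configuration that aborts multipart
    uploads which are still incomplete `days` days after initiation.

    The rule id and the default of 7 days are illustrative choices.
    """
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Status": "Enabled",
                # An empty prefix applies the rule to the whole bucket.
                "Filter": {"Prefix": ""},
                "AbortIncompleteMultipartUpload": {
                    "DaysAfterInitiation": days
                },
            }
        ]
    }


if __name__ == "__main__":
    # Print the configuration so it can be reviewed before it is applied.
    print(json.dumps(abort_incomplete_mpu_lifecycle(), indent=2))
```

With boto3, a dictionary of this shape can be passed as the `LifecycleConfiguration` argument of `put_bucket_lifecycle_configuration`. Note that this only covers leftover parts of multipart uploads. It would not remove complete objects that were written to the bucket but never registered by the server.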

@szaimen

This comment was marked as resolved.

@otherguy

@szaimen you asked previously to verify on 24 or 25. I have verified it still happens on 24.

Could you link to a PR or Changelog entry since then that should fix it?

@szaimen szaimen removed the bug label May 13, 2024
@joshtrichards joshtrichards added the needs review Needs review to determine if still applicable label Sep 5, 2024
@joshtrichards joshtrichards changed the title Nextcloud leaves large data files on S3 object storage after bad uploads Large data files left in S3 object storage after bad uploads Sep 8, 2024
@HelderFSFerreira

Same issue on 29.0.6

@joshtrichards
Member

#20333 (comment)

@joshtrichards joshtrichards added the hotspot: file transfer performance upload & download performance related optimizations label Nov 22, 2024
9 participants