Multi-Part upload to Azure Blob causes growing files #468

Closed
BasJ93 opened this issue Oct 27, 2022 · 4 comments

@BasJ93

BasJ93 commented Oct 27, 2022

We've deployed S3Proxy to an AKS cluster to be the proxy in front of Azure Storage. Half the blobs we upload are below the default 4MB limit for multi-part uploads to Azure, the other blobs start at 25MB.

These 25MB files are causing us problems: when we push them through the proxy, they increase in size. It appears the data is appended to the already stored blob instead of the blob being replaced.

We are currently working around this problem by first sending a delete command, but would prefer that the functionality works as expected.
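The delete-then-upload workaround described above could look roughly like the following. This is a minimal sketch, not the code used in the deployment: it assumes an S3 client object that exposes `delete_object` and `upload_file` the way a boto3 S3 client does, and the helper name `replace_object` is made up for illustration.

```python
def replace_object(client, bucket: str, key: str, path: str) -> None:
    """Delete any existing object at `key`, then upload the file at `path`.

    `client` is assumed to behave like a boto3 S3 client pointed at the
    S3Proxy endpoint (assumption: the original report does not say which
    client library the application uses).
    """
    # Remove the old object first so the new upload cannot be combined
    # with previously stored data.
    client.delete_object(Bucket=bucket, Key=key)
    # Upload the replacement; large files may be sent as a multi-part upload.
    client.upload_file(path, bucket, key)


# Hypothetical usage with boto3 (endpoint and credentials are placeholders):
#   import boto3
#   s3 = boto3.client("s3", endpoint_url="http://s3proxy.example",
#                     aws_access_key_id="...", aws_secret_access_key="...")
#   replace_object(s3, "test", "mc", "/home/mc")
```

The extra delete costs one more round trip per upload and leaves a short window in which the object does not exist, which is part of why a fix in the proxy is preferable.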

When we test the same application code against MinIO, we do not observe this growing file size problem, nor do we when we use the Azure API directly.

Perhaps we missed some configuration option? If not, have we discovered a bug?

@gaul
Owner

gaul commented Oct 27, 2022

Can you give more specific steps to reproduce your symptoms and the expected behavior? I don't understand what is happening.

@gaul gaul added the needinfo label Oct 27, 2022
@BasJ93
Author

BasJ93 commented Oct 31, 2022

@gaul Of course, please see if this helps.

Expected behavior

  • Upload file
  • Download file
  • Verify files are the same size
  • Upload file again (replacing the original file)
  • Download file
  • Verify files are the same size

Observed behavior

  • Upload file
  • Download file
  • Verify files are the same size
  • Upload file again (replacing the original file)
  • Download file
  • File in storage has now increased by the size of the newest upload.

Configuration

We've deployed s3proxy to an AKS cluster in this deployment:

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: s3proxy
  namespace: s3proxy
spec:
  replicas: 1
  selector:
    matchLabels:
      app: s3proxy
  template:
    metadata:
      labels:
        app: s3proxy
    spec:
      containers:
      - name: s3proxy
        image: andrewgaul/s3proxy:latest
        imagePullPolicy: Always
        ports:
        - containerPort: 80
        env:
        - name: LOG_LEVEL
          value: trace
        - name: JCLOUDS_PROVIDER
          value: azureblob
        - name: JCLOUDS_IDENTITY
          valueFrom:
            secretKeyRef:
              name: azure-credentials
              key: accesskey
        - name: JCLOUDS_CREDENTIAL
          valueFrom:
            secretKeyRef:
              name: azure-credentials
              key: secretkey
        - name: S3PROXY_IDENTITY
          valueFrom:
            secretKeyRef:
              name: proxy-credentials
              key: accesskey
        - name: S3PROXY_CREDENTIAL
          valueFrom:
            secretKeyRef:
              name: proxy-credentials
              key: secretkey

Steps to reproduce

To reproduce the effect, I've taken the mc client binary (which we also use to access s3proxy) and copied it to the storage. The first copy works as expected: I upload 24.11 MiB and then download 24.11 MiB.

When I then copy the exact same file to the exact same location, the file doubles in size: I upload 24.11 MiB but then download 48.21 MiB. See the output below.

root@my-shell:/# ./mc cp /home/mc s3proxy/test
/home/mc:      24.11 MiB / 24.11 MiB ━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.58 MiB/s 15s
root@my-shell:/# ./mc cp s3proxy/test/mc /home/mc2
...proxy/test/mc: 24.11 MiB / 24.11 MiB ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.15 MiB/s 21s
root@my-shell:/# ./mc cp /home/mc s3proxy/test
/home/mc:         24.11 MiB / 24.11 MiB ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.62 MiB/s 14s
root@my-shell:/# ./mc cp s3proxy/test/mc /home/mc3
...proxy/test/mc: 48.21 MiB / 48.21 MiB ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.15 MiB/s 41s

So, steps to reproduce:

  • Select any file larger than 4MB and upload it through s3proxy to Azure Storage.
  • Download the same file to verify its size.
  • Upload the same file again to the same location with the same name.
  • Download the file again and check its size; it has doubled.
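The steps above can also be written as a small automated check. The sketch below is an illustration only: `check_overwrite` and the two in-memory stores are hypothetical names, and `AppendingStore` merely mimics the observed symptom (data appended on re-upload); it makes no claim about what actually happens inside s3proxy or jclouds. Against a real deployment, `put` and `get` would wrap whatever S3 client is in use.

```python
from typing import Callable, Dict, Tuple


def check_overwrite(put: Callable[[str, bytes], None],
                    get: Callable[[str], bytes],
                    key: str, data: bytes) -> Tuple[int, int]:
    """Upload `data` twice to `key`; return the size read back after each upload."""
    put(key, data)
    first = len(get(key))
    put(key, data)   # second upload should replace the first
    second = len(get(key))
    return first, second


class ReplacingStore:
    """In-memory stand-in for a store with the expected overwrite semantics."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]


class AppendingStore(ReplacingStore):
    """In-memory stand-in that mimics the reported symptom only."""

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = self._objects.get(key, b"") + data


if __name__ == "__main__":
    payload = b"x" * (5 * 1024 * 1024)  # larger than the 4MB threshold
    for store in (ReplacingStore(), AppendingStore()):
        first, second = check_overwrite(store.put, store.get, "mc", payload)
        status = "ok" if first == second == len(payload) else "GREW"
        print(type(store).__name__, first, second, status)
```

A store with correct overwrite semantics returns two equal sizes; the behavior reported here would show the second size at twice the first.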

Additional remarks

We do not observe this effect when using MinIO without s3proxy or when directly uploading files to Azure storage without s3proxy.

@BasJ93
Author

BasJ93 commented Jan 31, 2023

@gaul, were you able to reproduce this issue?

@gaul gaul added the azure label Nov 5, 2024
gaul added a commit that referenced this issue Nov 10, 2024
@gaul
Owner

gaul commented Nov 10, 2024

JCLOUDS-1639 fixes this if you override jclouds.version to 2.6.1-SNAPSHOT in pom.xml. I will try to run a new jclouds release in the next month or two. Sorry for ignoring this issue for so long!

@gaul gaul removed the needinfo label Nov 10, 2024
@gaul gaul closed this as completed in 50715e0 Jan 30, 2025