This repository has been archived by the owner on Apr 3, 2023. It is now read-only.

Replace uploader & downloader implementations with rclone and compress before upload #4

Merged: 5 commits, Aug 28, 2019

Conversation

@aylei (Contributor) commented Aug 26, 2019

@xiaojingchen @onlymellb @tennix @weekface PTAL
Signed-off-by: Aylei rayingecho@gmail.com

Benchmark:

Environment: the container runs on an AWS m4.xlarge instance (4 CPU, 16 GB) in Tokyo with no resource limits. The data is uploaded to and downloaded from S3 in US East (Ohio).

Data size: 30.2 GB

rclone implementation:

Upload Time:

  • Compress: 17m 26s
  • Upload: 15m 05s
  • Total: 32m 32.74s

(Compressed size 14.5 GB)

Download Time:

  • Download: 7m 42s
  • Decompress: 10m 07s
  • Total: 17m 40.09s

Resources Usage:

  • Compress: the container stably consumes 400% CPU and 11 MB of memory
  • Upload: the container stably consumes 30% CPU and 23 MB of memory; avg network tx 16.4 MB/s
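As a back-of-the-envelope sanity check (assuming 1 GB ≈ 1000 MB), the upload rate implied by the figures above roughly matches the observed network tx:

```shell
# Rough throughput check from the benchmark figures above:
# 14.5 GB compressed, uploaded in 15m 05s.
size_mb=14500               # approx. compressed size in MB
upload_s=$((15 * 60 + 5))   # 905 seconds
awk -v s="$size_mb" -v t="$upload_s" 'BEGIN { printf "%.1f MB/s\n", s / t }'
# prints: 16.0 MB/s
```

That is consistent with the 16.4 MB/s average tx reported for the upload phase, i.e. the upload is network-bound rather than CPU-bound.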

Go implementation:

Upload Time: 38m 10s
Download Time: 54m 47s

Tested manually against S3 and Ceph.

s3 log:

docker run -it --entrypoint sh aylei:test
/ # mkdir -p /data/backup/
/ # cd /data/backup
/data/backup # echo 111 > sql1
/data/backup # echo 222 > sql2
/data/backup # cd /
/ # uploader  --cloud=aws --region=us-east-2 --bucket=new-bucket-aylei --backup-dir=/data/backup
+ basename /usr/local/bin/uploader
+ export 'RUN_MODE=uploader'
+ export 'CLOUD=aws'
+ shift
+ export 'REGION=us-east-2'
+ shift
+ export 'BUCKET=new-bucket-aylei'
+ shift
+ export 'BACKUP_DIR=/data/backup'
+ shift
+ '[' aws '=' gcp ]
+ '[' aws '=' ceph ]
+ '[' aws '=' aws ]
+ '[' -z aaaa ]
+ '[' -z xxxxx ]
+ cat
+ '[' uploader '=' uploader ]
+ dirname /data/backup
+ export 'BACKUP_BASE_DIR=/data'
+ basename /data/backup
+ tar cvzf /data/backup.tgz -C /data backup
backup/
backup/sql1
backup/sql2
+ rclone --config /tmp/rclone.conf copyto /data/backup.tgz aws:new-bucket-aylei//data/backup/backup.tgz
/usr/local/bin/uploader: line 1: rclone: not found
/ # mv rclone /usr/local/bin/rclone
/ # uploader  --cloud=aws --region=us-east-2 --bucket=new-bucket-aylei --backup-dir=/data/backup
+ basename /usr/local/bin/uploader
+ export 'RUN_MODE=uploader'
+ export 'CLOUD=aws'
+ shift
+ export 'REGION=us-east-2'
+ shift
+ export 'BUCKET=new-bucket-aylei'
+ shift
+ export 'BACKUP_DIR=/data/backup'
+ shift
+ '[' aws '=' gcp ]
+ '[' aws '=' ceph ]
+ '[' aws '=' aws ]
+ '[' -z aaaaa ]
+ '[' -z ccccc ]
+ cat
+ '[' uploader '=' uploader ]
+ dirname /data/backup
+ export 'BACKUP_BASE_DIR=/data'
+ basename /data/backup
+ tar cvzf /data/backup.tgz -C /data backup
backup/
backup/sql1
backup/sql2
+ rclone --config /tmp/rclone.conf copyto /data/backup.tgz aws:new-bucket-aylei//data/backup/backup.tgz
/ # downloader --cloud=aws --region=us-east-2 --bucket=new-bucket-aylei --srcDir=/data/backup --destDir=/data
+ basename /usr/local/bin/downloader
+ export 'RUN_MODE=downloader'
+ export 'CLOUD=aws'
+ shift
+ export 'REGION=us-east-2'
+ shift
+ export 'BUCKET=new-bucket-aylei'
+ shift
+ export 'BACKUP_DIR=/data/backup'
+ shift
+ export 'DEST_DIR=/data'
+ shift
+ '[' aws '=' gcp ]
+ '[' aws '=' ceph ]
+ '[' aws '=' aws ]
+ '[' -z aaaaa ]
+ '[' -z ccccc ]
+ cat
+ '[' downloader '=' uploader ]
+ '[' downloader '=' downloader ]
+ rclone --config /tmp/rclone.conf copyto aws:new-bucket-aylei//data/backup /data
+ '[' -f /data/backup.tgz ]
+ tar xzvf /data/backup.tgz -C /data
backup/
backup/sql1
backup/sql2
+ rm /data/backup.tgz
/ # cd /data
/data # cd backup/
/data/backup # ls
sql1  sql2
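The flow in the trace above (tar the backup directory, then rclone copyto, mirrored on the way back down) can be sketched as a standalone script. The rclone call is left commented out because it needs a configured remote; the /tmp/demo paths are made up for the illustration:

```shell
#!/bin/sh
# Sketch of the compress-then-upload / download-then-extract flow
# from the trace above. Paths under /tmp/demo are illustrative.
set -e

BACKUP_DIR=/tmp/demo/backup
BACKUP_BASE_DIR=$(dirname "$BACKUP_DIR")
ARCHIVE="$BACKUP_BASE_DIR/backup.tgz"

# Prepare some sample backup data, like the sql1/sql2 files in the trace.
mkdir -p "$BACKUP_DIR"
echo 111 > "$BACKUP_DIR/sql1"
echo 222 > "$BACKUP_DIR/sql2"

# Uploader side: compress the backup directory into one tarball.
tar czf "$ARCHIVE" -C "$BACKUP_BASE_DIR" "$(basename "$BACKUP_DIR")"

# Upload/download would go through rclone (needs a configured remote):
# rclone --config /tmp/rclone.conf copyto "$ARCHIVE" "aws:$BUCKET$BACKUP_DIR/backup.tgz"

# Downloader side: extract the archive, then remove it.
DEST_DIR=/tmp/demo/restore
mkdir -p "$DEST_DIR"
tar xzf "$ARCHIVE" -C "$DEST_DIR"
rm "$ARCHIVE"
ls "$DEST_DIR/backup"
```

Run locally, this exercises only the tar round trip; with a remote configured in /tmp/rclone.conf, uncommenting the rclone line reproduces the traced behavior.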

Note that the new image can read backups taken by the old image, but backups taken by the new image cannot be decompressed by the old image.
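The backward compatibility comes from the -f check visible in the downloader trace: decompression only happens when a backup.tgz exists, so an old-format backup (plain files, assuming the previous flow uploaded the directory uncompressed) still restores. A minimal sketch of that branch, with illustrative paths:

```shell
#!/bin/sh
# Sketch of the downloader's backward-compat branch: only decompress
# when a backup.tgz is present. Paths are illustrative.
set -e
DEST_DIR=/tmp/compat-demo
rm -rf "$DEST_DIR" && mkdir -p "$DEST_DIR"

restore() {
  if [ -f "$DEST_DIR/backup.tgz" ]; then
    # New-format backup: extract, then drop the archive.
    tar xzf "$DEST_DIR/backup.tgz" -C "$DEST_DIR"
    rm "$DEST_DIR/backup.tgz"
    echo "decompressed"
  else
    # Old-format backup: already plain files, nothing to do.
    echo "plain"
  fi
}

restore                     # old-format case: prints "plain"

# Simulate a new-format backup and restore it.
mkdir -p /tmp/compat-src/backup
echo 111 > /tmp/compat-src/backup/sql1
tar czf "$DEST_DIR/backup.tgz" -C /tmp/compat-src backup
restore                     # new-format case: prints "decompressed"
```

The old image has no such branch, which is why it cannot restore the new compressed format.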

GCP test has not been done yet.

@CLAassistant commented Aug 26, 2019

CLA assistant check: all committers have signed the CLA.

@aylei aylei requested a review from weekface August 26, 2019 09:16
@aylei aylei changed the title from "Replace uploader & downloader implementations with rclone" to "Replace uploader & downloader implementations with rclone and compress before upload" on Aug 26, 2019
@aylei (Contributor, Author) commented Aug 26, 2019

@onlymellb @xiaojingchen @tennix @weekface All comments addressed, PTAL

@gregwebs commented
Did you benchmark the two implementations?

@aylei (Contributor, Author) commented Aug 26, 2019

@gregwebs Not yet, I've found a data set with compressed size ~15 GiB, hopefully I can do a simple benchmark tomorrow.

@onlymellb (Contributor) left a review

LGTM

@tennix (Member) left a review

Rest LGTM

@aylei (Contributor, Author) commented Aug 27, 2019

@tennix Addressed, PTAL again

@aylei (Contributor, Author) commented Aug 27, 2019

@gregwebs I've posted a simple benchmark, but I'm really sorry that I don't have enough time these days to do a detailed benchmark.

@gregwebs left a review

benchmark is good enough for me
