Adding timestamps on S3 metadata #132

Open: ivanmp91 wants to merge 24 commits into base: master

ivanmp91 commented Mar 12, 2019

This PR includes the changes proposed in PR #93 and adds the following:

  • Keep mtime and atime during file restoration (they are stored as metadata on S3). This is useful for incremental backups when you need to restore specific sstables for a given time period.

  • Include the S3 connection host for restoration workers. This is needed for S3 buckets that require signature version 4.
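
The first change can be illustrated with a minimal sketch, assuming the timestamps are serialized as strings (S3 user-defined metadata values are strings) and applied with `os.utime` after download. The key names `mtime` and `atime` and both helper functions are hypothetical and are not taken from the PR's code; the S3 upload and download calls are left out.

```python
import os

# Hypothetical metadata key names; the keys used by the PR may differ.
MTIME_KEY = "mtime"
ATIME_KEY = "atime"


def timestamps_to_metadata(path):
    """Return the file's atime and mtime as string metadata values."""
    st = os.stat(path)
    # Serialize the float timestamps so they can be stored as metadata strings.
    return {ATIME_KEY: repr(st.st_atime), MTIME_KEY: repr(st.st_mtime)}


def apply_metadata_timestamps(path, metadata):
    """Set atime and mtime on a restored file from its stored metadata."""
    atime = float(metadata[ATIME_KEY])
    mtime = float(metadata[MTIME_KEY])
    os.utime(path, (atime, mtime))
```

At backup time the dictionary returned by `timestamps_to_metadata` would be attached to the uploaded object; at restore time the same dictionary, read back from the object, would be passed to `apply_metadata_timestamps` once the file has been written to disk.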

rhardouin and others added 23 commits February 1, 2017 15:58
The data dir already points to the "data" directory.
If --no-sstableloader option is set then files will just be downloaded.
Add verbose output to know which file is processed.
--local allows restoring data directly on the local server where the command is run.
The filenames are not prefixed by `<HOST>_` because we restore from only one node in this mode, so it would be useless.

Rename --cassandra-data-dir to --restore-dir because:
 * it's just a temporary directory when restoring with sstableloader
 * it's safer to download data in a different location than the Cassandra data directory
   as a first step when doing a local restore with --local
Fixed conflicts with current README on master
…s the timestamp of the modification time of the file on the origin filesystem.

ivanmp91 commented Mar 12, 2019

I also tested a local restore, and it works fine:

$ cassandra-snapshotter --s3-bucket-name=ck-test-cassandra --s3-bucket-region=us-east-2 --s3-base-path=backups restore --keyspace convertkit --snapshot-name 20190312130712 --hosts 172.21.175.231 --restore-dir /opt/cassandra/data/restores/ --local
Found 106 files, with total size of 10.3GB.
lzop: /opt/cassandra/data/restores/convertkit/subscriber_events-b233856036cc11e9bbeff17f618959fb/mc-135-big-CompressionInfo.db: already exists; not overwritten
ERROR:root:lzop Out: None
Error:None
Exit Code 1:
lzop: /opt/cassandra/data/restores/convertkit/subscriber_events-b233856036cc11e9bbeff17f618959fb/mc-135-big-Data.db: already exists; not overwritten
ERROR:root:Unable to create "/opt/cassandra/data/restores/convertkit/subscriber_events-b233856036cc11e9bbeff17f618959fb/mc-135-big-Data.db.lzo": [Errno 32] Broken pipe
lzop: /opt/cassandra/data/restores/convertkit/subscriber_events-b233856036cc11e9bbeff17f618959fb/mc-135-big-Digest.crc32: already exists; not overwritten
ERROR:root:lzop Out: None
Error:None
Exit Code 1:
lzop: /opt/cassandra/data/restores/convertkit/subscriber_events-b233856036cc11e9bbeff17f618959fb/mc-135-big-Filter.db: already exists; not overwritten
ERROR:root:lzop Out: None
Error:None
Exit Code 1:
lzop: /opt/cassandra/data/restores/convertkit/subscriber_events-b233856036cc11e9bbeff17f618959fb/mc-135-big-Index.db: already exists; not overwritten
ERROR:root:Unable to create "/opt/cassandra/data/restores/convertkit/subscriber_events-b233856036cc11e9bbeff17f618959fb/mc-135-big-Index.db.lzo": [Errno 32] Broken pipe
lzop: /opt/cassandra/data/restores/convertkit/subscriber_events-b233856036cc11e9bbeff17f618959fb/mc-135-big-Statistics.db: already exists; not overwritten
ERROR:root:lzop Out: None
Error:None
Exit Code 1:
lzop: /opt/cassandra/data/restores/convertkit/subscriber_events-b233856036cc11e9bbeff17f618959fb/mc-135-big-Summary.db: already exists; not overwritten
ERROR:root:lzop Out: None
Error:None
Exit Code 1:
lzop: /opt/cassandra/data/restores/convertkit/subscriber_events-b233856036cc11e9bbeff17f618959fb/mc-135-big-TOC.txt: already exists; not overwritten
ERROR:root:lzop Out: None
Error:None
Exit Code 1:
1.6GB / 10.3GB (15.57))

Those "already exists; not overwritten" errors are expected: incremental backups are downloaded first and are not overwritten by the sstables taken by the snapshot. I think this behaviour is correct for a usual full restore, so it's fine. Also, the restored sstables keep their access and modification times from the original filesystem:

$ pwd
/opt/cassandra/data/restores/convertkit/subscriber_events-b233856036cc11e9bbeff17f618959fb
$ ls -lrt
total 11087396
-rw-r--r-- 1 cassandra cassandra 92 Mar 6 10:27 mc-95-big-TOC.txt
-rw-r--r-- 1 cassandra cassandra 2783316 Mar 6 10:27 mc-95-big-Summary.db
-rw-r--r-- 1 cassandra cassandra 10338 Mar 6 10:27 mc-95-big-Statistics.db
-rw-r--r-- 1 cassandra cassandra 325539088 Mar 6 10:27 mc-95-big-Index.db
-rw-r--r-- 1 cassandra cassandra 10107992 Mar 6 10:27 mc-95-big-Filter.db
-rw-r--r-- 1 cassandra cassandra 10 Mar 6 10:27 mc-95-big-Digest.crc32
-rw-r--r-- 1 cassandra cassandra 9190671053 Mar 6 10:27 mc-95-big-Data.db
-rw-r--r-- 1 cassandra cassandra 1771491 Mar 6 10:27 mc-95-big-CompressionInfo.db
-rw-r--r-- 1 cassandra cassandra 92 Mar 6 20:30 mc-100-big-TOC.txt
-rw-r--r-- 1 cassandra cassandra 384436 Mar 6 20:30 mc-100-big-Summary.db
-rw-r--r-- 1 cassandra cassandra 10314 Mar 6 20:30 mc-100-big-Statistics.db
-rw-r--r-- 1 cassandra cassandra 44484182 Mar 6 20:30 mc-100-big-Index.db
-rw-r--r-- 1 cassandra cassandra 1382464 Mar 6 20:30 mc-100-big-Filter.db
-rw-r--r-- 1 cassandra cassandra 10 Mar 6 20:30 mc-100-big-Digest.crc32
-rw-r--r-- 1 cassandra cassandra 597376390 Mar 6 20:30 mc-100-big-Data.db
-rw-r--r-- 1 cassandra cassandra 113587 Mar 6 20:30 mc-100-big-CompressionInfo.db
-rw-r--r-- 1 cassandra cassandra 92 Mar 7 06:43 mc-105-big-TOC.txt
-rw-r--r-- 1 cassandra cassandra 390596 Mar 7 06:43 mc-105-big-Summary.db
-rw-r--r-- 1 cassandra cassandra 10314 Mar 7 06:43 mc-105-big-Statistics.db
-rw-r--r-- 1 cassandra cassandra 45187241 Mar 7 06:43 mc-105-big-Index.db
-rw-r--r-- 1 cassandra cassandra 1409640 Mar 7 06:43 mc-105-big-Filter.db
-rw-r--r-- 1 cassandra cassandra 9 Mar 7 06:43 mc-105-big-Digest.crc32
-rw-r--r-- 1 cassandra cassandra 591954100 Mar 7 06:43 mc-105-big-Data.db
-rw-r--r-- 1 cassandra cassandra 113179 Mar 7 06:43 mc-105-big-CompressionInfo.db
-rw-r--r-- 1 cassandra cassandra 92 Mar 7 09:15 mc-106-big-TOC.txt
-rw-r--r-- 1 cassandra cassandra 170552 Mar 7 09:15 mc-106-big-Summary.db
-rw-r--r-- 1 cassandra cassandra 10314 Mar 7 09:15 mc-106-big-Statistics.db
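
Because the restored files keep their original modification times, as in the listing above, sstables for a given time period can be picked out by mtime. The following is only a sketch of that idea; `sstables_in_window` is a hypothetical helper, not part of cassandra-snapshotter, and it assumes naive datetimes are meant as local time.

```python
import os
from datetime import datetime


def sstables_in_window(directory, start, end):
    """Return file names in `directory` whose mtime lies in [start, end]."""
    # Naive datetimes are interpreted as local time by datetime.timestamp().
    lo, hi = start.timestamp(), end.timestamp()
    selected = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and lo <= os.path.getmtime(path) <= hi:
            selected.append(name)
    return selected
```

For example, `sstables_in_window(restore_dir, datetime(2019, 3, 6), datetime(2019, 3, 7))` would return the files last modified on Mar 6, where `restore_dir` stands for the table directory under `--restore-dir`.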

What else do you need @rhardouin @tbarbugli to get this merged?
