Ability to rsync zipped directory with GCS? #1765
merzak274j asked this question in Q&A · Unanswered
Hi,
Thanks for this great library.
I’m wondering if it’s possible to use rsync for my use case. So far, after trying a few different methods, I have not been successful. I’ve also reviewed the Q&A and issues and wasn’t able to find a clear example of what I’m trying to do.
I would like to sync a zipped directory hosted on an FTP server to an unzipped directory on GCS.
Rsync would decompress and copy files from ftp://user:pass@ftp.example.com/path/to/archive.zip to gs://my-bucket/targetdirectory with the update condition set to never.
Is this supported? If so, would it be possible to get an example of the required syntax? So far, my efforts all result in a StartsWith error or an IsADirectoryError.
Using different syntaxes, I’ve tried:
- passing the FTP URL ending in .zip using URL chaining (as shown in the URL Chaining docs), which resulted in a "protocol not recognized (zip::ftp)" error;
- opening the FTP file with ZipFileSystem and setting the source to (myzipfilesysteminstance, "") or "/" to represent the root, even though rsync expects a string.
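For a single file, URL chaining does work for me when I test with a local archive standing in for the FTP one; the key seems to be that the outermost protocol comes first, i.e. zip://<path inside archive>::<url of archive>, not zip::ftp (the FTP form in the comment below is my untested guess):

```python
import os
import tempfile
import zipfile

import fsspec

# Build a small local archive standing in for the FTP-hosted zip.
tmp = tempfile.mkdtemp()
zpath = os.path.join(tmp, "archive.zip")
with zipfile.ZipFile(zpath, "w") as z:
    z.writestr("inner.txt", "hello")

# Chained URL: outermost protocol first, segments separated by "::".
# For FTP this would presumably be
# "zip://inner.txt::ftp://user:pass@ftp.example.com/path/to/archive.zip".
with fsspec.open(f"zip://inner.txt::file://{zpath}") as f:
    data = f.read()  # b"hello"

# Equivalent explicit form: a ZipFileSystem over the archive file.
zfs = fsspec.filesystem("zip", fo=zpath)  # fo can also be an fsspec OpenFile
print(zfs.find("/"))
```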
My zip file is quite large (several GB) and contains tens of thousands of files, so I’m looking for a clean, efficient solution that can also leverage concurrency while syncing the directories, which rsync seems to provide.
I am somewhat new to programming, so I’m not sure whether what I want is technically possible or, if not, how I could go about it with fsspec, maybe by using copy with many files to a directory. Any help would be appreciated.
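If rsync turns out not to support this, my fallback idea is a manual copy loop over a ZipFileSystem, sketched here with a local zip and the in-memory filesystem standing in for FTP and GCS (for the real case I assume I would pass fo=fsspec.open("ftp://user:pass@ftp.example.com/path/to/archive.zip") and use fsspec.filesystem("gcs"), though I haven’t verified that end to end):

```python
import os
import tempfile
import zipfile

import fsspec

# Toy archive standing in for the FTP-hosted zip.
tmp = tempfile.mkdtemp()
zpath = os.path.join(tmp, "archive.zip")
with zipfile.ZipFile(zpath, "w") as z:
    z.writestr("a.txt", "alpha")
    z.writestr("sub/b.txt", "beta")

zfs = fsspec.filesystem("zip", fo=zpath)
dest = fsspec.filesystem("memory")  # stand-in for the GCS filesystem
target = "/targetdirectory"

# Copy every file out of the archive, skipping anything that already
# exists at the destination (emulates update_cond="never").
for path in zfs.find("/"):
    out = f"{target}/{path.lstrip('/')}"
    if not dest.exists(out):
        dest.pipe_file(out, zfs.cat_file(path))
```

A loop like this could later be parallelized with a thread pool, since each file copy is independent, though that is beyond what I’ve tried.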