Feature Request: Use pigz #40
Comments
Rather than hard-coding a specific compression tool, I think we should either make it configurable or provide some pre-configured options, as syncoid does.
P.S. zstd is way better than pigz if available.
Agree, but pigz is a simple lowest-common-denominator that was going to be the START of the PR, but until I can understand why it's trying to readline() the gzip'ed data, I'm confused 8-\
Doing this correctly isn't as simple as just adding a pipeline somewhere. Also, piping "via" Python, as we do, doesn't mean Python actually processes all the data (it's an actual Unix pipe). I agree this should be a feature, just like mbuffer support. It first requires extending ExecuteNode so that piped commands can be added (either locally or remotely via ssh).
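As a rough illustration of the "actual Unix pipe" point, here is a minimal Python sketch. The helper names `start_pipeline` and `wait_all` are hypothetical and are not part of ExecuteNode or this project. The sketch only shows that when one process's stdout is handed directly to the next process's stdin, the kernel moves the bytes and Python merely holds the handles.

```python
import subprocess


def start_pipeline(commands, stdout=None):
    """Start commands connected by OS pipes and return the Popen objects.

    Hypothetical helper: `commands` is a list of argv lists, and `stdout`
    is where the last stage writes (None means inherit the parent's stdout).
    """
    procs = []
    upstream = None  # stdout of the previous stage, if any
    for index, argv in enumerate(commands):
        is_last = index == len(commands) - 1
        proc = subprocess.Popen(
            argv,
            stdin=upstream,
            stdout=stdout if is_last else subprocess.PIPE,
        )
        if upstream is not None:
            # Close our copy of the pipe so the upstream process gets
            # SIGPIPE if this stage exits early.
            upstream.close()
        upstream = proc.stdout
        procs.append(proc)
    return procs


def wait_all(procs):
    """Wait for every stage and return the exit codes in pipeline order."""
    return [proc.wait() for proc in procs]
```

With this shape, a call such as `start_pipeline([["zfs", "send", snapshot], ["pigz"]], stdout=target)` would compress the stream without Python reading any of it, assuming pigz is installed; `snapshot` and `target` are placeholders here.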
Depends on #50
I added the same compression options as syncoid. I changed it so zstd uses zstdmt for multithreading. I would recommend zstd-fast as the default compressor. (It's very fast and compresses pretty well.)
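To make the idea of pre-configured options concrete, here is a small Python sketch of a preset table. The preset names, the compression levels, and the helper names `compressor_commands` and `missing_tools` are assumptions for illustration; they are not taken from this project's code or copied from syncoid. The flags used (`-3`, `-9`, `-19`, `-dc`) are standard options of gzip, pigz, and zstd.

```python
import shutil

# Hypothetical preset table: name -> (compress argv, decompress argv).
# Each tool reads stdin and writes stdout when used inside a pipe.
PRESETS = {
    "none": (None, None),
    "gzip": (["gzip", "-3"], ["gzip", "-dc"]),
    "pigz-fast": (["pigz", "-3"], ["pigz", "-dc"]),
    "pigz-slow": (["pigz", "-9"], ["pigz", "-dc"]),
    "zstd-fast": (["zstdmt", "-3"], ["zstdmt", "-dc"]),
    "zstd-slow": (["zstdmt", "-19"], ["zstdmt", "-dc"]),
}


def compressor_commands(name):
    """Return (compress_argv, decompress_argv) for a preset name."""
    try:
        return PRESETS[name]
    except KeyError:
        raise ValueError(f"unknown compression preset: {name!r}") from None


def missing_tools(name):
    """Return the preset's executables that are not on the local PATH."""
    argvs = [argv for argv in compressor_commands(name) if argv]
    return sorted({argv[0] for argv in argvs if shutil.which(argv[0]) is None})
```

A caller could check `missing_tools("zstd-fast")` on the local side before starting a transfer; the same check would still need to run on the remote side, which this sketch does not cover.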
As CPUs are massively faster than anything else we have, we should be compressing data before moving it around. pigz is a multi-threaded implementation of gzip that scales across all available cores.
I've been running an older hacked version of zfs_backup which uses bash -c on the remote machine, and I was going to do a proper PR for this, but there seems to be some strange issue that I can't figure out.
This is my send command, which works fine:
I then use shell=True on the local Popen to allow this. You want to keep Python out of the way here, since it would just be moving data in and out of memory and slowing things down (the second line is self.debug(encoded_cmd) before the Popen, on line 413):
It looks like the new version is trying to readline().decode('utf-8') the binary data coming from zfs send (or, in this case, zfs send | pigz), and I can't understand why that would be happening.
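The following Python sketch is not a diagnosis of the code in question; it only illustrates the general failure mode under the assumption that a compressed stdout stream ends up in a line-reading, UTF-8-decoding path meant for log output. The helper names `is_utf8_text` and `run_and_log_stderr` are hypothetical. The idea is to decode only stderr as text and hand stdout to its destination untouched.

```python
import gzip
import subprocess

# Stand-in for a compressed send stream. gzip output starts with the bytes
# 0x1f 0x8b, and 0x8b is not a valid UTF-8 start byte, so decoding fails.
SAMPLE = gzip.compress(b"stand-in for a zfs send stream")


def is_utf8_text(data):
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def run_and_log_stderr(argv, stdout_target):
    """Run argv, send stdout to stdout_target, and decode only stderr.

    Returns (exit_code, stderr_lines). The binary stdout stream is never
    read or decoded by Python.
    """
    proc = subprocess.Popen(argv, stdout=stdout_target, stderr=subprocess.PIPE)
    with proc.stderr:
        lines = [
            raw.decode("utf-8", errors="replace").rstrip("\r\n")
            for raw in proc.stderr
        ]
    return proc.wait(), lines
```

Here `is_utf8_text(SAMPLE)` is False, which is the same kind of decode failure a readline().decode('utf-8') call would hit on compressed data.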
Any hints?