SquashFS use zstd as default #177

Merged
merged 4 commits into from
May 18, 2021

Conversation

siboehm
Contributor

@siboehm siboehm commented May 13, 2021

Closes #176

@anaconda-issue-bot anaconda-issue-bot added the cla-signed [bot] added once the contributor has signed the CLA label May 13, 2021
@dbast
Member

dbast commented May 15, 2021

@siboehm This looks really good. Thanks for investigating and polishing the compression settings.

Out of scope for this PR, but still in the context of polishing the SquashFS feature: would it make sense to add a check to the constructor of SquashFSArchive that the mksquashfs command is available, and exit with a message such as "Please make sure mksquashfs is available, e.g. install it via conda install squashfs-tools"? Or maybe instead handle the FileNotFoundError raised by the check_call call? That would be great from a user perspective, and it is better than adding squashfs-tools as a run dependency, since it is not available on PyPI and on conda-forge not for all platforms.
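A minimal sketch of both variants suggested above, for illustration only. `MissingToolError`, `ensure_available` and `run_tool` are hypothetical names, not part of conda-pack, and the constructor of SquashFSArchive is not shown here.

```python
import shutil
import subprocess


class MissingToolError(RuntimeError):
    """Raised when a required external command cannot be found (hypothetical)."""


def ensure_available(command="mksquashfs", package="squashfs-tools"):
    """Variant 1: look the command up on PATH before doing any work."""
    path = shutil.which(command)
    if path is None:
        raise MissingToolError(
            f"Please make sure {command} is available, "
            f"e.g. install it via `conda install {package}`."
        )
    return path


def run_tool(args, package="squashfs-tools"):
    """Variant 2: run the command and translate a missing binary into the same hint."""
    try:
        subprocess.check_call(args)
    except FileNotFoundError as exc:
        raise MissingToolError(
            f"Please make sure {args[0]} is available, "
            f"e.g. install it via `conda install {package}`."
        ) from exc
```

The first variant fails early, before any files are staged; the second keeps the check next to the actual call and needs no extra lookup.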

@dbast
Member

dbast commented May 16, 2021

This PR has one drawback: previously one could select between gzip, lzo and xz as the squashfs compression algorithm by choosing different levels; with this change only zstd and xz remain. During creation that is probably not a problem, as e.g. the conda-forge::squashfs-tools are compiled with gzip, lzo, lz4, zstd and xz enabled. But when mounting the conda-packed environment on another machine, the kernel there (afaik kernel >=4.14) or squashfuse has to be recent enough.
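As a rough illustration of the kernel requirement, the target machine's kernel release string could be compared against 4.14. This is only a sketch: `parse_version` and `kernel_can_mount_zstd` are hypothetical helpers, and the version number is just an indicator, since a distribution kernel may be built with a different set of compressors.

```python
import platform
import re


def parse_version(text):
    """Extract the leading numeric components, e.g. '4.19.0-16-amd64' -> (4, 19, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no version number in {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def kernel_can_mount_zstd(release=None):
    """Return True if the kernel release is at least 4.14 (the assumed zstd threshold)."""
    if release is None:
        release = platform.release()  # release string of the running kernel
    return parse_version(release)[:2] >= (4, 14)
```

Calling `kernel_can_mount_zstd()` without arguments checks the machine the code runs on, so it has to be run on the machine that will mount the archive, not the one that creates it.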

@siboehm
Contributor Author

siboehm commented May 16, 2021

That's true. We need squashfuse >=0.1.101 (released 12.2017) or kernel >=4.14 (released 11.2017) for zstd. However, with xz we still have a compression option that works as far back as kernel 2.6.38 (03.2011).

I don't want to change the API and I want it to be easy for users to get good performance without having to tweak anything. We could introduce a gzip option somewhere in the compression_level hierarchy, however I'd like to keep the default as zstd. Personally I think xz and squashfuse are enough options for working around an old kernel.
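To make the level hierarchy concrete, here is a simplified sketch of how a compression level could select the compressor passed to mksquashfs. It only encodes what this thread states (zstd by default, level 9 selects xz). The function names, the default of 4, and the omission of per-algorithm tuning and of an uncompressed level are assumptions, not the code in this PR.

```python
def compressor_for_level(compress_level):
    """Pick the squashfs compressor for a level in 0..9 (sketch, not conda-pack's code)."""
    if not 0 <= compress_level <= 9:
        raise ValueError("compress_level must be between 0 and 9")
    # Level 9 trades speed for compatibility with older kernels via xz;
    # every other level uses zstd in this simplified sketch.
    return "xz" if compress_level == 9 else "zstd"


def mksquashfs_args(source_dir, archive_path, compress_level=4):
    """Build an mksquashfs command line; the default level of 4 is an assumption."""
    return [
        "mksquashfs",
        source_dir,
        archive_path,
        "-comp",
        compressor_for_level(compress_level),
    ]
```

A gzip option, as discussed above, would be one more branch in `compressor_for_level` without touching the public API.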

@dbast
Member

dbast commented May 17, 2021

Looking at the current/LTS versions of still-supported Linux distributions (Debian 9/10, CentOS 7/8, Ubuntu 18.04/20.04) shows that not all have a squashfuse version with zstd support or a recent enough kernel: e.g. Debian 10 has version 0.1.103 but compiled without zstd support, and Ubuntu 18.04 has 0.1.100 without zstd. Also, the Linux kernels of Debian 9 and CentOS 7 are potentially (if nothing got backported) too old.

I understand that having zstd as the default would be technically super cool, but there are still tons of conservatively managed machines running somewhere with older versions.

Long story short: to prevent any surprises, why not present the full information to the user, e.g. by adding a print statement in the zstd case (the default) with "Using zstd level {} for best compression / speed ratio, which requires kernel >=4.14 or squashfuse >=0.1.101 compiled with zstd support for mounting (Select --compress-level 9, which uses xz, for compatibility with older versions.)"?
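The proposed notice could look roughly like this. It is a sketch using the standard `logging` module; the logger name and the helpers `compression_notice` and `announce_compression` are hypothetical, and the wording is taken from the suggestion above.

```python
import logging

logger = logging.getLogger("squashfs_sketch")  # hypothetical logger name

ZSTD_NOTICE = (
    "Using zstd level {} for best compression / speed ratio, which requires "
    "kernel >=4.14 or squashfuse >=0.1.101 compiled with zstd support for "
    "mounting (Select --compress-level 9, which uses xz, for compatibility "
    "with older versions.)"
)


def compression_notice(zstd_level):
    """Format the user-facing notice for a given zstd level."""
    return ZSTD_NOTICE.format(zstd_level)


def announce_compression(compressor, zstd_level=None):
    """Emit the notice only when zstd is selected; xz needs no such warning."""
    if compressor == "zstd":
        logger.info(compression_notice(zstd_level))
```

Using a logger instead of a bare print lets callers who embed the packer silence or redirect the message.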

@siboehm
Contributor Author

siboehm commented May 18, 2021

Ok, I tried to be explicit about the downside of zstd in the documentation, and I added a log message.

conda_pack/formats.py (outdated review comment)
This way the cleanup of tempfiles still runs.
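The change this review comment refers to is not shown in the thread. As a general illustration of the idea, a `try`/`finally` block guarantees that temporary files are removed even when the build step raises. `build_with_cleanup` is a hypothetical helper, not code from conda_pack/formats.py.

```python
import shutil
import tempfile


def build_with_cleanup(build):
    """Call build(staging_dir) and always remove the staging directory afterwards."""
    staging_dir = tempfile.mkdtemp()
    try:
        # Any exception raised here still passes through the finally block.
        return build(staging_dir)
    finally:
        shutil.rmtree(staging_dir, ignore_errors=True)
```

The exception still propagates to the caller, so a user-facing error (for example about a missing mksquashfs) is reported without leaving staging files behind.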
@dbast dbast merged commit ee3c0b4 into conda:master May 18, 2021
@dbast
Member

dbast commented May 18, 2021

@siboehm Thanks!

@github-actions github-actions bot added the locked [bot] locked due to inactivity label May 19, 2022
@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 19, 2022
Labels
cla-signed [bot] added once the contributor has signed the CLA locked [bot] locked due to inactivity
Development

Successfully merging this pull request may close these issues.

SquashFS picking a compression algorithm
3 participants