cmd/distpack: offer Zstandard-compressed archives in addition to gzip #62446
Comments
CC @golang/release |
Those are pretty compelling numbers. At least on my machine, with tar 1.34, The implementation detail is not so trivial. Creating release archives is now the responsibility of https://cs.opensource.google/go/go/+/master:src/cmd/distpack/pack.go, and we want them to be completely deterministic, which means using a compression algorithm that we can hold constant for the lifetime of a Go release. (See the associated blog post). We'd need to pull a zstd implementation into the distribution, either as a standard library package (unlikely), an internal package we own (time-consuming to write, unless someone wants to contribute it), or vendor something that looks solid (seems fine?). Overall I'm in favor of this, it seems like a moderate amount of effort and pretty much a pure win for users. |
Alternatively, rather than freezing it at the Go package layer, you could rely on |
@heschi Just a note that there is a package that we could vendor if we go that route: github.com/klauspost/compress/zstd. |
FWIW But to be fair it does have a bigger window size. Without the same it is 49.83MB - but there isn't too much reason to have the small window, if you are that resource constrained just use gz. |
As Heschi notes, the relevant code needs to live or be vendored into the Go tree so that we can reproduce the archives bit-for-bit even far into the future. We could do that, but it increases the cost. Shelling out to a separate tool that isn't versioned in the Go repo is not an option. We'd also have to update gorebuild to verify zstd as well. In the long term we may end up with zstd vendored anyway, or perhaps even added to the standard library. I'm OK with vendoring it for use in cmd/dist. That said, it will require work on the release team's part, and we may not have bandwidth for reviewing and deploying such a change in the near future. But in the abstract it sounds reasonable to me. |
If someone's interested in moving this forward, I think the steps are to vendor a zstd implementation, add support to There are two other kinds of artifacts not covered by this proposal: Windows distribution archives and toolchain module files, both .zip files. Wikipedia says that zip standardized zstd support a few years ago, so it's theoretically possible to make this change to both. For Windows, it would be interesting to survey implementations and see how usable a more advanced compression would be. For the toolchain module files, we'd need to teach the Go command to understand them, and (per discussion with Russ) probably start publishing a second series of archives, |
Yeah; No. Using the Windows 11 built-in extraction tools with zstd in a ZIP file just gives an |
This proposal has been added to the active column of the proposals project |
Are there any objections to adding this? |
Based on the discussion above, this proposal seems like a likely accept. Add .tar.zst archives anywhere we generate .tar.gz archives in cmd/distpack. In the longer term, this could be a step toward zstd-compressed modules, |
No change in consensus, so accepted. 🎉 Add .tar.zst archives anywhere we generate .tar.gz archives in cmd/distpack. In the longer term, this could be a step toward zstd-compressed modules, |
Out of curiosity, would the thinking there be to keep the module archives as ZIP, but swap the compression algorithm to zstd, or to switch to something else entirely like The latter is more standard in terms of zstd compression, and will give a better compression ratio since all files are compressed together, but we would lose the ability to seek through files without decompressing. I suspect that's not a problem, given that GOPROXY serves |
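The seekability trade-off mentioned above can be sketched with the standard library's `archive/zip`: each member is compressed on its own, so a reader can open one file through the central directory without decompressing the others. The helper names `makeZip` and `readOne` and the file names are hypothetical, and the example uses deflate, since `archive/zip` has no built-in zstd method.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io"
)

// makeZip (hypothetical helper) builds an in-memory ZIP in which every
// file is deflate-compressed independently of the others.
// Entry order follows map iteration here; a real release tool would sort.
func makeZip(files map[string]string) ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	for name, body := range files {
		w, err := zw.Create(name)
		if err != nil {
			return nil, err
		}
		if _, err := io.WriteString(w, body); err != nil {
			return nil, err
		}
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// readOne (hypothetical helper) decompresses only the named member.
func readOne(archive []byte, name string) (string, error) {
	zr, err := zip.NewReader(bytes.NewReader(archive), int64(len(archive)))
	if err != nil {
		return "", err
	}
	f, err := zr.Open(name) // looks the entry up via the central directory
	if err != nil {
		return "", err
	}
	defer f.Close()
	data, err := io.ReadAll(f)
	return string(data), err
}

func main() {
	archive, err := makeZip(map[string]string{
		"go.mod":   "module example.com/m\n",
		"pkg/a.go": "package pkg\n",
	})
	if err != nil {
		panic(err)
	}
	body, err := readOne(archive, "go.mod")
	if err != nil {
		panic(err)
	}
	fmt.Print(body)
}
```

With a single compressed tar stream, reaching one member generally means decompressing everything before it, which is the cost the comment weighs against the better compression ratio.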
This is inspired by #62445, where @dsnet proposes using zopfli to create ~6% smaller .gz downloads for Go release downloads.
As he writes in that issue:
This proposal is to help usher in that future by offering zstd downloads in addition to gzip.
Here's a very quick'n'dirty comparison of compression performance on the same go1.21.0.linux-amd64.tar.gz archive Joe looked at:

Also, decompressing the .zst archives takes about 4x less CPU time than decompressing the .gz archives on my machine.
If we offered .gz and .zst, people who care at all about size and speed can just use .zst and get a much bigger benefit than if we had zopfli-encoded .gzs.
Footnotes
This is an estimate based on the fact that the file size falls between gzip -5 and gzip -6. I think that the actual release process uses compress/gzip, which is quite a bit slower. ↩