modules: consider switching to zstd for modules archives #2744
Comments
I'm not against this per se, but some nuance on a couple of points above:
This makes it sound a bit like module archives will always be fully extracted to disk for CUE evaluation, but I can certainly imagine significant situations where that might not be necessary: for example, when evaluating CUE non-interactively, such as in a browser or in a one-shot server-side evaluation. The zip format could arguably provide significant performance advantages there, as it allows decompressing only the files involved in the required packages.
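To make the random-access point concrete, here is a minimal sketch using only Go's standard `archive/zip` package. It builds a small archive in memory and then decompresses a single entry without touching the others. The helper names `buildArchive` and `readOne` and the file names are hypothetical and only for illustration; this is not how `cue/load` or the modzip package is implemented.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io"
)

// buildArchive writes the given files into an in-memory zip archive.
// Each entry is compressed independently (deflate by default).
func buildArchive(files map[string]string) ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	for name, content := range files {
		w, err := zw.Create(name)
		if err != nil {
			return nil, err
		}
		if _, err := io.WriteString(w, content); err != nil {
			return nil, err
		}
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// readOne decompresses only the named entry. The zip central directory
// lets the reader seek straight to that entry's data.
func readOne(archive []byte, name string) (string, error) {
	zr, err := zip.NewReader(bytes.NewReader(archive), int64(len(archive)))
	if err != nil {
		return "", err
	}
	f, err := zr.Open(name) // *zip.Reader implements fs.FS
	if err != nil {
		return "", err
	}
	defer f.Close()
	data, err := io.ReadAll(f)
	if err != nil {
		return "", err
	}
	return string(data), nil
}

func main() {
	archive, err := buildArchive(map[string]string{
		"a/a.cue": "package a\nx: 1\n",
		"b/b.cue": "package b\ny: 2\n",
	})
	if err != nil {
		panic(err)
	}
	content, err := readOne(archive, "b/b.cue")
	if err != nil {
		panic(err)
	}
	fmt.Print(content)
}
```

A tar stream compressed as a whole offers no equivalent shortcut: reaching one file generally means decompressing everything before it.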
I'd be inclined to avoid that, for now at least, because of Russ's reservations about tooling support under Windows.
How would you display errors? You would need paths/filenames in some form for the sake of debugging. I guess you could somehow treat the zip archive as a directory, but it would break anything that expects absolute paths to actually be files on disk, e.g. CI failure log viewers or terminals/editors where filenames are linkified.
I think this argument goes both ways.
I agree that zip files with zstd compression are likely not the best option - they do marginally improve compression, but as a middle-ground solution, it makes no one happy :)
This is a good question. In general even file names aren't sufficient, because they're relative to the local filesystem which varies from place to place. One possibility is to use some kind of URL notation (not impossible because it might well be possible to point directly to the registry from whence the source came), or use a custom notation that identifies the source module and version. In general, I wouldn't want to make it infeasible to evaluate CUE in situations where there's no available filesystem, and conversely, I think that tying ourselves to file-based error messages is probably a bit too limiting (a temporary file name might mean nothing to a user where a more domain-focused name might be more informative).
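One way to picture the "custom notation that identifies the source module and version" is sketched below. The `ModulePos` type and its `module@version/file:line:column` format are purely hypothetical, invented here for illustration; nothing in this thread or in CUE defines such a format.

```go
package main

import "fmt"

// ModulePos is a hypothetical source position that names the module and
// version a file came from, rather than a path on the local filesystem.
type ModulePos struct {
	Module  string // module path, e.g. "example.com/foo"
	Version string // module version, e.g. "v0.1.0"
	File    string // slash-separated path relative to the module root
	Line    int
	Column  int
}

// String renders the position in an assumed module@version/file:line:column form.
func (p ModulePos) String() string {
	return fmt.Sprintf("%s@%s/%s:%d:%d", p.Module, p.Version, p.File, p.Line, p.Column)
}

func main() {
	pos := ModulePos{
		Module:  "example.com/foo",
		Version: "v0.1.0",
		File:    "schema/types.cue",
		Line:    12,
		Column:  3,
	}
	// Prints: example.com/foo@v0.1.0/schema/types.cue:12:3: field not allowed
	fmt.Printf("%s: field not allowed\n", pos)
}
```

Such a position stays meaningful when no filesystem is available, at the cost raised earlier in the thread: tools that linkify absolute file paths would not recognize it.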
Note that I'm not against using
Fair enough. To be clear, our error messages are already filename-based today, so I'm talking in practical terms about what is already the status quo.
Fair enough again. I don't suspect that module archives will become large today, but it's hard to predict how large they might get in the future. I think we're in general agreement that we're OK with keeping standard zips for our first artifact version. For some rough realistic numbers, I ran a quick test of zip vs tar.zst on our latest alpha source archive:
So it seems like standard zip can take up to twice as much space as a well-compressed tar.zst. Network and disk space are relatively cheap these days, so I don't think halving the archive size warrants losing
I'm also warming up to the idea that zip with zstd compression rather than deflate may be the future, e.g. for a
For all the reasons above, I'm happy to ship v0.8.0 as currently implemented, with standard deflate zips. We can consider a v2 with zip+zstd in a few years, for the sake of decent network and disk usage wins, without losing
One point we raised with @rogpeppe and @myitcv was to redesign the current modzip package so that it doesn't hard-code assumptions about zip archives, but could instead be generic to any archive format that is
So I'm actually fine with leaving the modzip API as it is currently. I don't think we will move away from zip archives in the next decade, at least not while
What version of CUE are you using (cue version)?
Per @mvdan, Go is adding support for zstd compression of their release archives: golang/go#62446 (comment)
Given we've still got plenty of room to make design changes for modules (we are only experimenting right now), maybe it would be a good idea to do zstd from day one and not have to support older/worse compression algorithms like deflate.
Points to consider per @mvdan:
cue/load as it is now, but also for the LSP by design
io/fs
std doesn't have zstd yet (compress/zstd: add new package, golang/go#62513), but it's only a matter of time. They already have an internal/zstd decompressor, and they'll very soon want a compressor for net/http. In the meantime, we can use https://pkg.go.dev/github.com/klauspost/compress/zstd