This repository has been archived by the owner on Feb 28, 2023. It is now read-only.

Compress toolchains in parallel fashion via gzp crate #127

Merged
Xanewok merged 3 commits into master from igor-par-compress on Dec 16, 2021

Conversation

@Xanewok (Contributor) commented Dec 14, 2021

No description provided.

@Xanewok (Contributor, Author) commented Dec 15, 2021

By default, it seems that ParCompressBuilder::new() uses a compression level of 3, whereas our single-threaded (ST) implementation used level 6, so that may be the reason we're seeing such a big speedup. I'll change this to 6 manually so we can compare apples to apples.
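For reference, a minimal sketch of pinning the level explicitly (not the exact patch; the gzp builder method names here are from memory and should be double-checked against the crate docs):

```rust
use std::io::Write;

use gzp::{
    deflate::Gzip,
    par::compress::{ParCompress, ParCompressBuilder},
    ZWriter,
};

fn main() -> std::io::Result<()> {
    let file = std::fs::File::create("toolchain.tar.gz")?;

    // ParCompressBuilder::new() defaults to a lower compression level than the
    // flate2-based single-threaded path used, so request level 6 explicitly.
    let mut writer: ParCompress<Gzip> = ParCompressBuilder::new()
        .compression_level(flate2::Compression::new(6))
        .from_writer(file);

    writer.write_all(b"toolchain tarball bytes would be streamed here")?;
    writer
        .finish()
        .map_err(|e| std::io::Error::new(std::io::ErrorKind::Other, e))?;
    Ok(())
}
```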

@Xanewok Xanewok marked this pull request as ready for review December 15, 2021 21:45
@Xanewok (Contributor, Author) commented Dec 16, 2021

Here's a breakdown:

|                   | master (7922bd6) | first commit (597e6b5) | second commit (2d962e0) |
|-------------------|------------------|------------------------|-------------------------|
| time (Run tests)  | 5m 42s (link)    | 2m 15s (link)          | 2m 51s (link)           |
| compression level | 6                | 3                      | 6                       |
| crate             | flate2 (ST)      | gzp (MT)               | gzp (MT)                |

We measure the dist-test runs because each test_dist_* starts with a fresh directory containing no toolchains, so in each run we package/compress the relevant toolchain and submit it to the test build server.

It might also be useful to measure the size differences at some point to further compare compression levels 3 and 6, but it's clear that the time savings are there at the same compression level of 6.

I've read that, due to the nature of the DEFLATE format, parallel decompression doesn't scale with the number of threads as nicely as compression does, so I've only touched the compression code. I'd need to investigate further whether decompressing in parallel really wouldn't give us a meaningful speedup.

@sylvestre would you be interested in this patch for mozilla/sccache?

@sylvestre

@Xanewok why not? :)
Are you aware of any drawbacks?

@Xanewok (Contributor, Author) commented Dec 16, 2021

One thing that immediately comes to mind is thread saturation - by default it uses num_threads: num_cpus::get() for the compression. Maybe it'd be better to use, let's say, half of that in our scenario?

@drahnr (Contributor) commented Dec 16, 2021

> One thing that immediately comes to mind is thread saturation - by default it uses num_threads: num_cpus::get() for the compression. Maybe it'd be better to use, let's say, half of that in our scenario?

Since the entire compilation will block on it, I'd recommend using the full CPU set.
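To make the trade-off concrete, here's a hypothetical helper (names invented for illustration, not part of this patch) for picking the thread count, which would then be handed to the compressor's num_threads setting:

```rust
/// Hypothetical helper: decide how many threads to give the parallel
/// compressor. Using the full CPU set follows the reasoning that the
/// distributed compilation blocks on compression anyway; the halved variant
/// is the more conservative option floated above.
fn compression_threads(use_full_cpu_set: bool) -> usize {
    let cpus = num_cpus::get();
    if use_full_cpu_set {
        cpus
    } else {
        // Leave headroom for other work, but never drop below one thread.
        (cpus / 2).max(1)
    }
}
```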

@sylvestre

Would it be possible to add a test to verify that it works accordingly?

@sylvestre

Maybe it's overkill, but could an argument be added to sccache to control this?

@Xanewok (Contributor, Author) commented Dec 16, 2021

> Would it be possible to add a test to verify that it works accordingly?

If the toolchain isn't packaged and compressed correctly, the existing dist test suite will fail, because we'll fall back to local compilation (something that's already checked).

@Xanewok (Contributor, Author) commented Dec 16, 2021

> Maybe it's overkill, but could an argument be added to sccache to control this?

Maybe conditional compilation/a feature flag would be better in this case? If gzp is enabled, we use the parallel compression; otherwise we fall back to single-threaded flate2 (this has the added benefit of not compiling additional dependencies if the user wants the ST version).
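A rough sketch of how that gating could look (the feature name and helper are made up for illustration, and the gzp builder calls are from memory, so treat this as a sketch rather than the actual patch):

```rust
use std::io::{self, Write};

/// Compress `data` into `writer` at level 6, using gzp's parallel compressor
/// when a (hypothetical) `parallel-compression` Cargo feature is enabled, and
/// falling back to single-threaded flate2 otherwise.
#[cfg(feature = "parallel-compression")]
fn compress_into<W: Write + Send + 'static>(writer: W, data: &[u8]) -> io::Result<()> {
    use gzp::{deflate::Gzip, par::compress::ParCompressBuilder, ZWriter};

    let mut gz = ParCompressBuilder::<Gzip>::new()
        .compression_level(flate2::Compression::new(6))
        .from_writer(writer);
    gz.write_all(data)?;
    gz.finish()
        .map_err(|e| io::Error::new(io::ErrorKind::Other, e))
}

#[cfg(not(feature = "parallel-compression"))]
fn compress_into<W: Write + Send + 'static>(writer: W, data: &[u8]) -> io::Result<()> {
    use flate2::{write::GzEncoder, Compression};

    let mut gz = GzEncoder::new(writer, Compression::new(6));
    gz.write_all(data)?;
    gz.finish().map(|_| ())
}
```

In Cargo.toml, gzp would then be marked optional and only pulled in by that feature, so the single-threaded build doesn't compile the extra dependencies.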

@drahnr (Contributor) commented Dec 16, 2021

This might be the path to go for sccache, but we don't want to add extra complexity to cachepot when we haven't pushed our first release to crates.io just yet :)

@Xanewok Xanewok merged commit 9148e4e into master Dec 16, 2021
@Xanewok Xanewok deleted the igor-par-compress branch December 16, 2021 23:28
montekki pushed a commit that referenced this pull request Dec 20, 2021
* Compress toolchains in parallel fashion via `gzp` crate

* Use previous compression level of 6