This repository has been archived by the owner on Feb 28, 2023. It is now read-only.

Compress toolchains in parallel fashion via gzp crate #127

Merged
Xanewok merged 3 commits into master from igor-par-compress on Dec 16, 2021

Conversation

@Xanewok (Contributor) commented Dec 14, 2021

No description provided.

@Xanewok (Contributor, Author) commented Dec 15, 2021

By default, it seems that ParCompressBuilder::new() uses a compression level of 3, whereas our single-threaded (ST) implementation used level 6, so that may be the reason we're seeing such a big speedup. I'll change this to 6 manually so we can compare apples to apples.
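For reference, a minimal sketch of pinning the level explicitly (not the exact patch; the gzp builder method names here are from memory and should be double-checked against the crate docs):

```rust
use std::io::Write;

use gzp::{
    deflate::Gzip,
    par::compress::{ParCompress, ParCompressBuilder},
    ZWriter,
};

fn main() -> std::io::Result<()> {
    let file = std::fs::File::create("toolchain.tar.gz")?;

    // ParCompressBuilder::new() defaults to a lower compression level than the
    // flate2-based single-threaded path used, so request level 6 explicitly.
    let mut writer: ParCompress<Gzip> = ParCompressBuilder::new()
        .compression_level(flate2::Compression::new(6))
        .from_writer(file);

    writer.write_all(b"toolchain tarball bytes would be streamed here")?;
    writer
        .finish()
        .map_err(|e| std::io::Error::new(std::io::ErrorKind::Other, e))?;
    Ok(())
}
```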

@Xanewok Xanewok marked this pull request as ready for review December 15, 2021 21:45
@Xanewok (Contributor, Author) commented Dec 16, 2021

Here's a breakdown:

|                   | master (7922bd6) | first commit (597e6b5) | second commit (2d962e0) |
|-------------------|------------------|------------------------|-------------------------|
| time (Run tests)  | 5m 42s (link)    | 2m 15s (link)          | 2m 51s (link)           |
| compression level | 6                | 3                      | 6                       |
| crate             | flate2 (ST)      | gzp (MT)               | gzp (MT)                |

We measure the dist-test runs because each test_dist_* starts with a fresh directory containing no toolchains, so in each run we package/compress the relevant toolchain and submit it to the test build server.

It might also be useful to measure the size differences at some point to further compare compression levels 3 and 6, but it's clear that the time savings are there at the same compression level of 6.

I've read that, due to the nature of the DEFLATE format, parallel decompression doesn't scale with the number of threads as nicely as compression does, so I've only touched the compression code. I'd need to investigate further whether decompressing in parallel really wouldn't give us a meaningful speedup.

@sylvestre would you be interested in this patch for mozilla/sccache?

@sylvestre

@Xanewok why not? :)
Are you aware of any drawbacks?

@Xanewok (Contributor, Author) commented Dec 16, 2021

One thing that immediately comes to mind is thread saturation - by default it uses num_threads: num_cpus::get() for the compression. Maybe it'd be better to use, let's say, half of that in our scenario?

@drahnr (Contributor) commented Dec 16, 2021

> One thing that immediately comes to mind is thread saturation - by default it uses num_threads: num_cpus::get() for the compression. Maybe it'd be better to use, let's say, half of that in our scenario?

Since the entire compilation will block on it, I'd recommend using the full CPU set.
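To make the trade-off concrete, here's a hypothetical helper (names invented for illustration, not part of this patch) for picking the thread count, which would then be handed to the compressor's num_threads setting:

```rust
/// Hypothetical helper: decide how many threads to give the parallel
/// compressor. Using the full CPU set follows the reasoning that the
/// distributed compilation blocks on compression anyway; the halved variant
/// is the more conservative option floated above.
fn compression_threads(use_full_cpu_set: bool) -> usize {
    let cpus = num_cpus::get();
    if use_full_cpu_set {
        cpus
    } else {
        // Leave headroom for other work, but never drop below one thread.
        (cpus / 2).max(1)
    }
}
```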

@sylvestre

Would it be possible to add a test to verify that it works accordingly?

@sylvestre

Maybe it's overkill, but could an argument be added to sccache to control this?

@Xanewok (Contributor, Author) commented Dec 16, 2021

> Would it be possible to add a test to verify that it works accordingly?

If the toolchain isn't packaged and compressed correctly, the existing dist test suite will fail, because we'll fall back to local compilation (something that's already checked).

@Xanewok (Contributor, Author) commented Dec 16, 2021

> Maybe it's overkill, but could an argument be added to sccache to control this?

Maybe conditional compilation/a feature flag would be better in this case? If gzp is enabled, we use the parallel compression; otherwise we fall back to single-threaded flate2 (this has the added benefit of not compiling additional dependencies if the user wants the ST version).
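A rough sketch of how that gating could look (the feature name and helper are made up for illustration, and the gzp builder calls are from memory, so treat this as a sketch rather than the actual patch):

```rust
use std::io::{self, Write};

/// Compress `data` into `writer` at level 6, using gzp's parallel compressor
/// when a (hypothetical) `parallel-compression` Cargo feature is enabled, and
/// falling back to single-threaded flate2 otherwise.
#[cfg(feature = "parallel-compression")]
fn compress_into<W: Write + Send + 'static>(writer: W, data: &[u8]) -> io::Result<()> {
    use gzp::{deflate::Gzip, par::compress::ParCompressBuilder, ZWriter};

    let mut gz = ParCompressBuilder::<Gzip>::new()
        .compression_level(flate2::Compression::new(6))
        .from_writer(writer);
    gz.write_all(data)?;
    gz.finish()
        .map_err(|e| io::Error::new(io::ErrorKind::Other, e))
}

#[cfg(not(feature = "parallel-compression"))]
fn compress_into<W: Write + Send + 'static>(writer: W, data: &[u8]) -> io::Result<()> {
    use flate2::{write::GzEncoder, Compression};

    let mut gz = GzEncoder::new(writer, Compression::new(6));
    gz.write_all(data)?;
    gz.finish().map(|_| ())
}
```

In Cargo.toml, gzp would then be marked optional and only pulled in by that feature, so the single-threaded build doesn't compile the extra dependencies.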

@drahnr (Contributor) commented Dec 16, 2021

This might be the path to go for sccache, but we don't want to add extra complexity to cachepot when we haven't pushed our first release to crates.io just yet :)

@Xanewok Xanewok merged commit 9148e4e into master Dec 16, 2021
@Xanewok Xanewok deleted the igor-par-compress branch December 16, 2021 23:28
montekki pushed a commit that referenced this pull request Dec 20, 2021
* Compress toolchains in parallel fashion via `gzp` crate

* Use previous compression level of 6