
Automatically purge target directories after reaching max size #346

Open
pietroalbini opened this issue Oct 15, 2018 · 7 comments
Labels
A-runner Area: the graph runner C-enhancement Category: enhancement to an existing feature

Comments

@pietroalbini
Member

At the moment the target directories used by Crater don't have a size limit, so they reach hundreds of gigabytes in size, forcing us to have 4TB disks on the agents. We should implement a way to keep them within a configurable size.

@aidanhs suggested removing them after the configured size is reached. This will slow down the runs a lot, but if we keep the max size high enough we might only clear them once or twice during a run, which shouldn't hurt speed too much.

@pietroalbini pietroalbini added C-enhancement Category: enhancement to an existing feature A-runner Area: the graph runner labels Oct 15, 2018
@Eh2406
Contributor

Eh2406 commented Jan 9, 2019

suggested removing them after the configured size is reached.

Is there a way of determining the size of a folder that is faster than iterating over all the files and summing the size?

If so, then we can run cargo-sweep each time we hit the configured size; it keeps only artifacts newer than a timestamp.

If not, I can add an argument to cargo-sweep to remove older artifacts until the folder is under a target size.
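That argument could work roughly like this sketch (an illustration only, not cargo-sweep's actual implementation; the `sweep_to_size` name and the delete-oldest-first policy are assumptions): walk the folder, sort files least-recently-modified first, and delete until the total is under the limit.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::time::SystemTime;

// Illustrative sketch only: delete the oldest-modified files in `dir`
// until the total size drops to at most `max_size` bytes.
fn sweep_to_size(dir: &Path, max_size: u64) -> io::Result<()> {
    let mut files: Vec<(SystemTime, u64, PathBuf)> = Vec::new();
    collect_files(dir, &mut files)?;
    // Least recently modified first, so the "coldest" artifacts go first.
    files.sort_by_key(|entry| entry.0);
    let mut total: u64 = files.iter().map(|entry| entry.1).sum();
    for (_mtime, size, path) in files {
        if total <= max_size {
            break;
        }
        fs::remove_file(&path)?;
        total -= size;
    }
    Ok(())
}

// Recursively record (mtime, size, path) for every file under `dir`.
fn collect_files(dir: &Path, out: &mut Vec<(SystemTime, u64, PathBuf)>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let meta = entry.metadata()?;
        if meta.is_dir() {
            collect_files(&entry.path(), out)?;
        } else {
            out.push((meta.modified()?, meta.len(), entry.path()));
        }
    }
    Ok(())
}
```

Note the full walk is exactly the cost being discussed above, which is why the question of a cheaper size check matters.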

Also, do you happen to know if the target folder supports "last access time"? If, like my computer, it does not, then the next step is on Cargo; if it does support it, then I can make progress with cargo-sweep.

@pietroalbini
Member Author

Is there a way of determining the size of a folder that is faster than iterating over all the files and summing the size?

The easiest way is to do whatever cleanup routine we choose after we reach, let's say, 90% of total disk usage on the partition. Querying the free space on a partition should be instantaneous I think.
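As a sketch of that threshold check (the `should_purge` name is illustrative, not Crater's code): on Unix, the total and available byte counts come from a single statvfs(3) call (e.g. via the `libc` or `nix` crates), so the check costs nothing compared to walking the target directory.

```rust
/// Illustrative helper: decide whether to run the cleanup routine based
/// on partition usage. `threshold` is a fraction, e.g. 0.9 for the 90%
/// figure suggested above. The byte counts would come from statvfs(3)
/// or an equivalent platform API, which is a single syscall.
fn should_purge(total_bytes: u64, available_bytes: u64, threshold: f64) -> bool {
    let used = total_bytes.saturating_sub(available_bytes);
    used as f64 / total_bytes as f64 >= threshold
}
```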

If not, I can add an argument to cargo-sweep to remove older artifacts until the folder is under a target size.

This would be great!

Also, do you happen to know if the target folder supports "last access time"? If, like my computer, it does not, then the next step is on Cargo; if it does support it, then I can make progress with cargo-sweep.

It's really unreliable on the current machines, and AFAIK making it reliable will slow things down a lot.

bors added a commit to rust-lang/cargo that referenced this issue Jan 16, 2019
touch some files when we use them

This is a small change to improve the ability for a third party subcommand to clean up a target folder. I consider this part of the push to experiment with out of tree GC, as discussed in #6229.

How does it work?
--------

This updates the modification time of a file in each fingerprint folder, and the modification time of the intermediate outputs, every time cargo checks that they are up to date. This allows a third-party subcommand to look at the modification time of the timestamp file to determine the last time a cargo invocation required that file. This is far more reliable than the current practice of looking at the `accessed` time: `accessed` time is unavailable or disabled on many operating systems, and is routinely set by arbitrary other programs.
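The timestamp-file mechanism can be sketched like this (a hedged illustration, not cargo's actual code; rewriting a zero-byte marker file is simply one portable way to bump an mtime):

```rust
use std::io;
use std::path::Path;

// Illustrative sketch: each time an artifact is confirmed up to date,
// bump a marker file's mtime so a GC tool such as cargo-sweep can read
// "last modified" as "last used". Writing (even zero bytes) updates the
// file's modification time; cargo's real implementation differs.
fn touch(marker: &Path) -> io::Result<()> {
    std::fs::write(marker, b"")
}
```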

Is this enough to be useful?
--------

The current implementation of cargo sweep on master will automatically use this data with no change to the code. With this PR, it will work even on systems that do not update `accessed` time.

This also allows a crude script to clean some of the largest subfolders based on each file's modification time.

Is this worth adding, or should we just build `clean --outdated` into cargo?
------
I would love to see a `clean --outdated` in cargo! However, I think there is a lot of design work before we can make something good enough to deserve the cargo team's stamp of approval, especially as an in-tree version will have to work with many use cases, some of which are yet to be designed (like distributed builds). Even just including `cargo-sweep`'s existing functionality opens a full bikeshed about what arguments to take, and in what form (`cargo-sweep` takes a days argument, but maybe we should have minutes, or an ISO standard time, or ...). This PR, or an equivalent, allows out-of-tree experimentation with all the different interfaces, and is basically required for any LRU-based system. (For example, [Crater](rust-lang/crater#346) wants a GC that cleans files in an LRU manner to keep a target folder below a target size. That is not a use case widely enough needed to be worth adding to cargo, but it is one supported by this PR.)

What are the downsides?
----

1. There are legitimate performance concerns about writing so many small files during a NOP build.
2. There are legitimate concerns about unnecessary writes on read-only filesystems.
3. If we add this, and it starts seeing widespread use, we may be de facto stabilizing the folder structure we use. (This is probably true of any system that allows out of tree experimentation.)
4. This may not be an efficient way to store the data. (It does have the advantage of not needing different cargos to manipulate the same file. But if you have a better idea please make a suggestion.)
@Eh2406
Contributor

Eh2406 commented Jan 21, 2019

The LRU cleaning has a PR.

The PR to have Cargo maintain the mtime was merged, but was then put behind a feature flag as it broke the playground. @ehuss reports that the feature as implemented slows down the playground by ~11sec, and a more limited version (just enough for cargo-sweep to work) slows it down by 2sec. I think Crater is also using Docker and AWS in a similar way.

@pietroalbini
Member Author

The PR to have Cargo maintain the mtime was merged, but was then put behind a feature flag as it broke the playground. ehuss reports that the feature as implemented slows down the playground by ~11sec, and a more limited version (just enough for cargo-sweep to work) slows it down by 2sec. I think Crater is also using Docker and AWS in a similar way.

Yep, we're basically using the exact same setup.
Thanks for all your effort on this by the way!

@aidanhs
Member

aidanhs commented Jan 22, 2019

When I've suggested this previously it has been in combination with a proposal that we do some form of sorting of crates with their dependencies to reduce the impact of auto-cleaning (e.g. chances are you'll have finished a bunch of crates that then don't need rebuilding).

More interestingly, one could imagine selectively cleaning out dependencies when they're done (e.g. once you're done with serde version 1.0.9 you clean just that).

@Eh2406
Contributor

Eh2406 commented Jan 23, 2019

On Discord, @aidanhs said: "I can say that crater doesn't put the target directory in the image, so there's no CoW stuff. In fact I think the crater image is entirely read-only and used for linking libraries."

Which suggests that a quick experiment to see how big the overhead is with the existing flag may be worth it, if someone has the time.

I like the idea of a Crater-graph-aware solution; I think it can build on #193. I can volunteer to help insofar as it involves adding to the "target folder gc" ecosystem. (I am mostly interested in helping that; Crater is just an example at the moment.)

One thought I had: if Crater is walking over a topological sort of the things to build and knows it is done with everything it built before "foo", then it can use cargo-sweep with the exact time it built "foo". Or even just delete all files with a creation time before "foo".
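That cutoff-based cleanup could be sketched roughly like this (the name `remove_older_than` and its interface are assumptions for illustration, not cargo-sweep's CLI): once every crate ordered before "foo" is finished, drop all files last modified before the moment "foo" was built.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::SystemTime;

// Hypothetical sketch: recursively remove every file under `dir` whose
// modification time is older than `cutoff` (e.g. the time "foo" was
// built). Directories themselves are left in place.
fn remove_older_than(dir: &Path, cutoff: SystemTime) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let meta = entry.metadata()?;
        if meta.is_dir() {
            remove_older_than(&entry.path(), cutoff)?;
        } else if meta.modified()? < cutoff {
            fs::remove_file(entry.path())?;
        }
    }
    Ok(())
}
```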

I think `cargo clean -p "serde v1.0.9"` may be all the "gc" needed for your "more interesting" suggestion.

@Eh2406
Contributor

Eh2406 commented Apr 23, 2019

From Discord, @pietroalbini at 12:21 PM:

redeployed crater -- the target directory is now automatically being cleared
rm -rf target when we reach 90% disk usage
but that's the quickest solution I could get up and running fast, since we're going over 4tb now

#411
