Auto merge of #6477 - Eh2406:add-a-timestamp-file, r=ehuss
touch some files when we use them

This is a small change to improve the ability for a third party subcommand to clean up a target folder. I consider this part of the push to experiment with out of tree GC, as discussed in #6229.

how does it work?
--------

This updates the modification time of a file in each fingerprint folder, and the modification time of the intermediate outputs, every time cargo checks that they are up to date. This allows a third party subcommand to look at the modification time of the timestamp file to determine the last time a cargo invocation required that file. This is far more reliable than the current practice of looking at the `accessed` time. `accessed` time is unavailable or disabled on many operating systems, and is routinely updated by arbitrary other programs.
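
As a rough illustration of the mechanism (not the code in this PR, which uses the `filetime` crate), here is a sketch using only the standard library. The names `touch` and `used_since` are hypothetical, and `File::set_modified` assumes Rust 1.75 or newer.

```rust
use std::fs::{self, File};
use std::io;
use std::path::Path;
use std::time::SystemTime;

/// Set the modification time of an existing file to "now", roughly what
/// cargo does when it finds an artifact up to date.
/// (Assumption: std-only stand-in for `filetime::set_file_times`.)
fn touch(path: &Path) -> io::Result<()> {
    // Opening with write access is needed to change the timestamp.
    let file = File::options().write(true).open(path)?;
    file.set_modified(SystemTime::now())
}

/// Hypothetical helper for a cleanup tool: was this file required by a
/// cargo invocation at or after `since`, judging by its mtime alone?
fn used_since(path: &Path, since: SystemTime) -> io::Result<bool> {
    let mtime = fs::metadata(path)?.modified()?;
    Ok(mtime >= since)
}
```

A cleaner would record a timestamp, let the builds it cares about run, and then treat any file for which `used_since` returns `false` as a candidate for removal.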

is this enough to be useful?
--------

The current implementation of cargo sweep on master will automatically use this data with no change to the code. With this PR, it will work even on systems that do not update `accessed` time.

This also allows a crude script to clean some of the largest subfolders based on each file's modification time.
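
A crude cleaner of that kind might look like the sketch below. It is only an illustration under assumptions: `remove_stale_files` is a hypothetical name, it looks at regular files directly inside one directory (for example `target/debug/deps`), and it trusts mtime as the "last used" signal.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::time::{Duration, SystemTime};

/// Remove regular files directly inside `dir` whose mtime is older than
/// `max_age`, and return the paths that were removed.
fn remove_stale_files(dir: &Path, max_age: Duration) -> io::Result<Vec<PathBuf>> {
    // Fall back to the epoch if `max_age` is larger than the current time.
    let cutoff = SystemTime::now()
        .checked_sub(max_age)
        .unwrap_or(SystemTime::UNIX_EPOCH);
    let mut removed = Vec::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let meta = entry.metadata()?;
        // Subdirectories are skipped; only plain files are considered.
        if meta.is_file() && meta.modified()? < cutoff {
            fs::remove_file(entry.path())?;
            removed.push(entry.path());
        }
    }
    Ok(removed)
}
```

Calling it with `Duration::from_secs(30 * 24 * 3600)` would drop files not touched in roughly 30 days.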

is this worth adding, or should we just build `clean --outdated` into cargo?
------
I would love to see a `clean --outdated` in cargo! However, I think there is a lot of design work to do before we can make something good enough to deserve the cargo team's stamp of approval, especially as an in-tree version will have to work with many use cases, some of which are yet to be designed (like distributed builds). Even just including `cargo-sweep`'s existing functionality opens a full bike shop about what arguments to take, and in what form (`cargo-sweep` takes a days argument, but maybe we should have a minutes or an ISO standard time or ...). This PR, or an equivalent, allows out-of-tree experimentation with all different interfaces, and is basically required for any `LRU`-based system. (For example, [Crater](rust-lang/crater#346) wants a GC that cleans files in an `LRU` manner to keep a target folder below a target size. This use case is not needed widely enough to be worth adding to cargo, but it is one supported by this PR.)
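
To make the `LRU` idea concrete, here is a minimal sketch of size-bounded eviction driven by mtime. It is not Crater's implementation; `evict_lru` and its parameters are hypothetical, and it only considers regular files directly inside one directory.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::time::SystemTime;

/// Delete the least recently modified files in `dir` until the total size
/// of the remaining files is at most `max_total_bytes`.
/// Returns the removed paths, oldest first.
fn evict_lru(dir: &Path, max_total_bytes: u64) -> io::Result<Vec<PathBuf>> {
    let mut files: Vec<(SystemTime, u64, PathBuf)> = Vec::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let meta = entry.metadata()?;
        if meta.is_file() {
            files.push((meta.modified()?, meta.len(), entry.path()));
        }
    }
    // Oldest mtime first: these are the least recently used artifacts.
    files.sort_by_key(|(mtime, _, _)| *mtime);

    let mut total: u64 = files.iter().map(|(_, len, _)| *len).sum();
    let mut removed = Vec::new();
    for (_, len, path) in files {
        if total <= max_total_bytes {
            break;
        }
        fs::remove_file(&path)?;
        total -= len;
        removed.push(path);
    }
    Ok(removed)
}
```

This only works if mtime tracks use rather than creation, which is the property this PR adds.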

what are the downsides?
----

1. There are legitimate performance concerns about writing so many small files during a NOP build.
2. There are legitimate concerns about unnecessary writes on read-only filesystems.
3. If we add this, and it starts seeing widespread use, we may be de facto stabilizing the folder structure we use. (This is probably true of any system that allows out of tree experimentation.)
4. This may not be an efficient way to store the data. (It does have the advantage of not needing different cargos to manipulate the same file. But if you have a better idea please make a suggestion.)
bors committed Jan 16, 2019
2 parents 9b5d4b7 + 97363ca commit 513d230
Showing 3 changed files with 191 additions and 2 deletions.
18 changes: 16 additions & 2 deletions src/cargo/core/compiler/fingerprint.rs
@@ -3,6 +3,7 @@ use std::fs;
use std::hash::{self, Hasher};
use std::path::{Path, PathBuf};
use std::sync::{Arc, Mutex};
use std::time::SystemTime;

use filetime::FileTime;
use log::{debug, info};
@@ -88,6 +89,7 @@ pub fn prepare_target<'a, 'cfg>(

let root = cx.files().out_dir(unit);
let missing_outputs = {
let t = FileTime::from_system_time(SystemTime::now());
if unit.mode.is_doc() {
!root
.join(unit.target.crate_name())
@@ -98,8 +100,15 @@ pub fn prepare_target<'a, 'cfg>(
.outputs(unit)?
.iter()
.filter(|output| output.flavor != FileFlavor::DebugInfo)
.find(|output| !output.path.exists())
{
.find(|output| {
if output.path.exists() {
// update the mtime so other cleaners know we used it
let _ = filetime::set_file_times(&output.path, t, t);
false
} else {
true
}
}) {
None => false,
Some(output) => {
info!("missing output path {:?}", output.path);
@@ -681,6 +690,11 @@ pub fn dep_info_loc<'a, 'cfg>(cx: &mut Context<'a, 'cfg>, unit: &Unit<'a>) -> Pa

fn compare_old_fingerprint(loc: &Path, new_fingerprint: &Fingerprint) -> CargoResult<()> {
let old_fingerprint_short = paths::read(loc)?;

// update the mtime so other cleaners know we used it
let t = FileTime::from_system_time(SystemTime::now());
filetime::set_file_times(loc, t, t)?;

let new_hash = new_fingerprint.hash();

if util::to_hex(new_hash) == old_fingerprint_short {
172 changes: 172 additions & 0 deletions tests/testsuite/freshness.rs
@@ -1,7 +1,9 @@
use std::fs::{self, File, OpenOptions};
use std::io::prelude::*;
use std::net::TcpListener;
use std::path::PathBuf;
use std::thread;
use std::time::SystemTime;

use crate::support::paths::CargoPathExt;
use crate::support::registry::Package;
@@ -1178,6 +1180,176 @@ fn changing_rustflags_is_cached() {
.run();
}

fn simple_deps_cleaner(mut dir: PathBuf, timestamp: filetime::FileTime) {
// Cargo is experimenting with letting outside projects develop some
// limited forms of GC for target_dir. This is one of the forms.
// Specifically, Cargo is updating the mtime of files in
// target/profile/deps each time it uses the file.
// So a cleaner can remove files older than a timestamp without
// affecting any builds that happened since that timestamp.
let mut cleand = false;
dir.push("deps");
for dep in fs::read_dir(&dir).unwrap() {
let dep = dep.unwrap();
if filetime::FileTime::from_last_modification_time(&dep.metadata().unwrap()) <= timestamp {
fs::remove_file(dep.path()).unwrap();
println!("remove: {:?}", dep.path());
cleand = true;
}
}
assert!(
cleand,
"called simple_deps_cleaner, but there was nothing to remove"
);
}

#[test]
fn simple_deps_cleaner_dose_not_rebuild() {
let p = project()
.file(
"Cargo.toml",
r#"
[package]
name = "foo"
version = "0.0.1"
[dependencies]
bar = { path = "bar" }
"#,
)
.file("src/lib.rs", "")
.file("bar/Cargo.toml", &basic_manifest("bar", "0.0.1"))
.file("bar/src/lib.rs", "")
.build();

p.cargo("build").run();
p.cargo("build")
.env("RUSTFLAGS", "-C target-cpu=native")
.with_stderr(
"\
[COMPILING] bar v0.0.1 ([..])
[COMPILING] foo v0.0.1 ([..])
[FINISHED] dev [unoptimized + debuginfo] target(s) in [..]",
)
.run();
if is_coarse_mtime() {
sleep_ms(1000);
}
let timestamp = filetime::FileTime::from_system_time(SystemTime::now());
if is_coarse_mtime() {
sleep_ms(1000);
}
// This does not make new files, but it does update the mtime.
p.cargo("build")
.env("RUSTFLAGS", "-C target-cpu=native")
.with_stderr("[FINISHED] dev [unoptimized + debuginfo] target(s) in [..]")
.run();
simple_deps_cleaner(p.target_debug_dir(), timestamp);
// This should not recompile!
p.cargo("build")
.env("RUSTFLAGS", "-C target-cpu=native")
.with_stderr("[FINISHED] dev [unoptimized + debuginfo] target(s) in [..]")
.run();
// But this should be cleaned and so need a rebuild
p.cargo("build")
.with_stderr(
"\
[COMPILING] bar v0.0.1 ([..])
[COMPILING] foo v0.0.1 ([..])
[FINISHED] dev [unoptimized + debuginfo] target(s) in [..]",
)
.run();
}

fn fingerprint_cleaner(mut dir: PathBuf, timestamp: filetime::FileTime) {
// Cargo is experimenting with letting outside projects develop some
// limited forms of GC for target_dir. This is one of the forms.
// Specifically, Cargo is updating the mtime of a file in
// target/profile/.fingerprint each time it uses the fingerprint.
// So a cleaner can remove files associated with a fingerprint
// if all the files in the fingerprint's folder are older than a timestamp without
// affecting any builds that happened since that timestamp.
let mut cleand = false;
dir.push(".fingerprint");
for fing in fs::read_dir(&dir).unwrap() {
let fing = fing.unwrap();

if fs::read_dir(fing.path()).unwrap().all(|f| {
filetime::FileTime::from_last_modification_time(&f.unwrap().metadata().unwrap())
<= timestamp
}) {
fs::remove_dir_all(fing.path()).unwrap();
println!("remove: {:?}", fing.path());
// a real cleaner would remove the big files in deps and build as well
// but fingerprint is sufficient for our tests
cleand = true;
} else {
}
}
assert!(
cleand,
"called fingerprint_cleaner, but there was nothing to remove"
);
}

#[test]
fn fingerprint_cleaner_dose_not_rebuild() {
let p = project()
.file(
"Cargo.toml",
r#"
[package]
name = "foo"
version = "0.0.1"
[dependencies]
bar = { path = "bar" }
"#,
)
.file("src/lib.rs", "")
.file("bar/Cargo.toml", &basic_manifest("bar", "0.0.1"))
.file("bar/src/lib.rs", "")
.build();

p.cargo("build").run();
p.cargo("build")
.env("RUSTFLAGS", "-C target-cpu=native")
.with_stderr(
"\
[COMPILING] bar v0.0.1 ([..])
[COMPILING] foo v0.0.1 ([..])
[FINISHED] dev [unoptimized + debuginfo] target(s) in [..]",
)
.run();
if is_coarse_mtime() {
sleep_ms(1000);
}
let timestamp = filetime::FileTime::from_system_time(SystemTime::now());
if is_coarse_mtime() {
sleep_ms(1000);
}
// This does not make new files, but it does update the mtime.
p.cargo("build")
.env("RUSTFLAGS", "-C target-cpu=native")
.with_stderr("[FINISHED] dev [unoptimized + debuginfo] target(s) in [..]")
.run();
fingerprint_cleaner(p.target_debug_dir(), timestamp);
// This should not recompile!
p.cargo("build")
.env("RUSTFLAGS", "-C target-cpu=native")
.with_stderr("[FINISHED] dev [unoptimized + debuginfo] target(s) in [..]")
.run();
// But this should be cleaned and so need a rebuild
p.cargo("build")
.with_stderr(
"\
[COMPILING] bar v0.0.1 ([..])
[COMPILING] foo v0.0.1 ([..])
[FINISHED] dev [unoptimized + debuginfo] target(s) in [..]",
)
.run();
}

#[test]
fn reuse_panic_build_dep_test() {
let p = project()
3 changes: 3 additions & 0 deletions tests/testsuite/support/mod.rs
@@ -1603,6 +1603,9 @@ pub fn sleep_ms(ms: u64) {

/// Returns true if the local filesystem has low-resolution mtimes.
pub fn is_coarse_mtime() -> bool {
// If the filetime crate is being used to emulate HFS then
// return true, without looking at the actual hardware.
cfg!(emulate_second_only_system) ||
// This should actually be a test that $CARGO_TARGET_DIR is on an HFS
// filesystem, (or any filesystem with low-resolution mtimes). However,
// that's tricky to detect, so for now just deal with CI.