relying on source code modification time leads to a stale precompilation cache hit #42251

Closed
colinxs opened this issue Sep 15, 2021 · 1 comment

colinxs commented Sep 15, 2021

Nix is a package manager that strives for reproducible builds. One of the ways it achieves that is by setting the modification time of all build products to the start of the Unix epoch. This conflicts with Julia's code loading. From the manual:

For file dependencies, a change is determined by examining whether the modification time (mtime) of each file loaded by include or added explicitly by include_dependency is unchanged, or equal to the modification time truncated to the nearest second (to accommodate systems that can't copy mtime with sub-second accuracy)

This results in the following bug:

  1. Build a Julia package, Example, with Nix. This gets installed to /nix/store/path-to-first-build.
  2. Run julia --project=/nix/store/path-to-first-build -e 'using Example; Example.greet()'
  3. Modify the source code of Example
  4. Build it again. This build lives at /nix/store/path-to-second-build.
  5. Run julia --project=/nix/store/path-to-second-build -e 'using Example; Example.greet()'. Julia sees that the mtime for the source under /nix/store/path-to-second-build is the same as under /nix/store/path-to-first-build and uses the precompiled cache generated in (2).

I originally filed an issue at NixOS/nixpkgs#137926, but as this affects multiple build systems (Nix and Guix off the top of my head, as well as any other build system that obeys SOURCE_DATE_EPOCH), it seems the fix should live here in the upstream repo. You can find a more extensive description of the bug and a minimal code example over at that issue.

I believe the offending source code that leads to this bug is:

if ftime != ftime_req && ftime != floor(ftime_req) && ftime != trunc(ftime_req, digits=6)
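
To make the failure mode concrete, here is a minimal sketch that wraps the comparison quoted above in a standalone function. The name `mtime_says_stale` is hypothetical (it is not part of Julia's loading code), and the value 1.0 is assumed to be the fixed mtime in the Nix store, based on the value the nixpkgs patch checks for.

```julia
# Hypothetical helper mirroring the quoted condition.
# `ftime`     : current mtime of the source file on disk
# `ftime_req` : mtime recorded in the precompile cache
function mtime_says_stale(ftime::Float64, ftime_req::Float64)
    return ftime != ftime_req &&
           ftime != floor(ftime_req) &&
           ftime != trunc(ftime_req, digits=6)
end

# Ordinary edit: the mtime moves forward, so the cache is rejected.
mtime_says_stale(1.6317e9 + 5, 1.6317e9)  # true

# Two different store paths that share the same fixed mtime (assumed 1.0):
# the check reports "not stale" even if the file contents differ.
mtime_says_stale(1.0, 1.0)                # false
```

The second call is the situation in step 5: nothing in the comparison looks at the file contents, so a rebuilt package with an identical mtime is indistinguishable from the first build.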

Previously we had this patch in nixpkgs to get around this bug, but due to problems compiling Julia from source, we switched to the official precompiled binaries and can no longer patch the source.

This patch is fairly Nix (and possibly Guix) specific as it hardcodes a check for ftime != 1.0. A more flexible and comprehensive check might be to compare hashes and, if they are equal, compare contents. On a laptop with an i9-10885H, running sha256sum on a 1 million LOC/100 MB file takes about 450 ms; xxhash takes only 22 ms by comparison. I've yet to run across a Julia package even remotely that large, but I know time-to-first-plot is a sore subject for some folks :).
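
A rough sketch of what a content-based check could look like, using the SHA standard library. The helpers `content_hash` and `is_stale` are hypothetical names for illustration only. The sketch also assumes the digest would be recorded alongside the cache at precompile time, and it does not perform the follow-up byte-for-byte comparison mentioned above.

```julia
using SHA  # Julia standard library

# Hypothetical helper: hex-encoded SHA-256 digest of a file's contents.
content_hash(path::AbstractString) = bytes2hex(open(sha256, path))

# Hypothetical staleness check: the cache is stale when the current digest
# differs from the one recorded when the cache was built.
is_stale(path::AbstractString, recorded_hash::AbstractString) =
    content_hash(path) != recorded_hash

# Usage: simulate "build, precompile, modify the source, build again".
dir = mktempdir()
src = joinpath(dir, "Example.jl")

write(src, "greet() = print(\"Hello\")\n")
recorded = content_hash(src)      # digest stored with the cache
unchanged = is_stale(src, recorded)

write(src, "greet() = print(\"Goodbye\")\n")
changed = is_stale(src, recorded)

println((unchanged, changed))     # prints (false, true)
```

Unlike the mtime comparison, this result does not depend on filesystem timestamps at all, so it behaves the same under build systems that pin mtimes to a fixed value.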


colinxs commented Feb 4, 2022

Looks like this was fixed in #43090. 👍

colinxs closed this as completed Feb 4, 2022