ExecOp: update content-cache mount logic #4624

jedevc · 2024-02-05T16:21:56Z

See discussion in Slack: https://dockercommunity.slack.com/archives/C7S7A40MP/p1706711391769259

This is split into 4 commits for readability:

exec: refactor content-based cache detection
- Refactors content-based cache logic to be more readable, and add context comments from Readwrite option for bind mount invalidates cache #903 (comment).
exec: allow content-cache for root selected mounts (with @sipsma for coming up with the idea)
- Allows for root-selected mounts to be content-cached
exec: allow caller-controlled content-based cache
- Allows LLB-controlled cache fields to dictate content-based cache. Each mount can be enabled/disabled/left at the default. Some mounts cannot have it enabled for safety reasons, and this should fail.
test: add new content-cache exec mount tests
- This adds detailed tests for all the old+new behavior, explicitly listing all the cases we really actually care about.

tonistiigi

I think this requires a new cap as well.

solver/llbsolver/ops/exec.go

This refactors the content-based cache to be that little bit tidier. In addition to adding comments that explain *why* we're even bothering, this restructures the code to avoid being unclear. To explain the changes in a little more detail (since it's not abundantly clear why this translation is valid), the initial condition looks like: if (!m.Readonly || m.Dest == pb.RootMount) && m.Output != -1 { deps[m.Input].NoContentBasedHash = true We can apply De Morgan's law recursively to invert the condition and the result: deps[m.Input].NoContentBasedHash = true if (m.Readonly && m.Dest != pb.RootMount) || m.Output == -1 { deps[m.Input].NoContentBasedHash = false With all the juggling of NoContentBasedCache, we invert the variable name to be ContentBasedCache (and invert everywhere it's used as well): if (m.Readonly && m.Dest != pb.RootMount) || m.Output == -1 { deps[m.Input].ContentBasedHash = true This reads a bit easier, but now we split this into two separate branches for readability (and so we can comment each one in more detail separately): if m.Readonly && m.Dest != pb.RootMount { deps[m.Input].ContentBasedHash = true } if m.Output == -1 { deps[m.Input].ContentBasedHash = true } While this has been the behavior for ages, I think it makes sense to deliberately this behavior slightly. It doesn't make sense to me that we should only disallow read-only root mounts, but no-output root mounts are allowed - there's no reason these shouldn't behave identically, by splitting these out: if m.Readonly { deps[m.Input].ContentBasedHash = true } if m.Output == -1 { deps[m.Input].ContentBasedHash = true } if m.Dest == pb.RootMount { deps[m.Input].ContentBasedHash = false } There is a small chance that this is a breaking change for some users, however, 1. SkipOutput (-1) is very rare and not often used in the wild (except for dockerfiles, where it's only used for non-root mounts), and 2. will only cause a cache miss. Signed-off-by: Justin Chadwell <me@jedevc.com>

These mounts are actually safe, as suggested by Erik on slack: > Is it correct that this wouldn’t be a problem in the case where the > selector of the mount is just “/“? Because then there’s no “hidden” > files. > > If so, maybe there’s a path to enabling content cache for rw mounts > that are from “/“. Then you could also get the same end behavior by > not using selectors and instead always copying the subdir you want to > mount to scratch. This patch adds a check for this case, and explicitly enables content-based cache for these cases. Co-authored-by: Erik Sipsma <erik@sipsma.dev> Signed-off-by: Justin Chadwell <me@jedevc.com>

This allows LLB-directed content-based cache enablement for each mount. Some mounts may not be explicitly unabled (because it would be unsafe) - for these cases we explicitly error out. Signed-off-by: Justin Chadwell <me@jedevc.com>

These test all of the new behavior: - Checks for old default no content cache - Checks for old read-only and no-output allowed content cache - Checks for new root selector allowed content cache - Checks for new caller options that allow enabling/disabling it Signed-off-by: Justin Chadwell <me@jedevc.com>

jedevc · 2024-02-06T11:46:29Z

I think this requires a new cap as well.

I think this should do the job? https://github.com/jedevc/buildkit/blob/control-mount-content-cache/solver/pb/caps.go#L60

jedevc requested review from tonistiigi and sipsma February 5, 2024 16:21

jedevc linked an issue Feb 5, 2024 that may be closed by this pull request

WithMountedDirectory/WithMountedFile does not use content-based cache in withExec dagger/dagger#6421

Open

tonistiigi reviewed Feb 6, 2024

View reviewed changes

solver/llbsolver/ops/exec.go Show resolved Hide resolved

jedevc and others added 4 commits February 6, 2024 11:18

exec: allow caller-controlled content-based cache

0eb25a6

This allows LLB-directed content-based cache enablement for each mount. Some mounts may not be explicitly unabled (because it would be unsafe) - for these cases we explicitly error out. Signed-off-by: Justin Chadwell <me@jedevc.com>

jedevc force-pushed the control-mount-content-cache branch from fa2435c to ed2efe3 Compare February 6, 2024 11:44

tonistiigi approved these changes Feb 6, 2024

View reviewed changes

tonistiigi merged commit aa8d9ef into moby:master Feb 7, 2024
63 checks passed

tonistiigi mentioned this pull request Aug 13, 2024

exec: fix incorrect deps computation for special mounts #5251

Merged

crazy-max mentioned this pull request Sep 20, 2024

unexpected metrics for BenchmarkBuildSecret moby/buildkit-bench#124

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ExecOp: update content-cache mount logic #4624

ExecOp: update content-cache mount logic #4624

jedevc commented Feb 5, 2024

tonistiigi left a comment

jedevc commented Feb 6, 2024

ExecOp: update content-cache mount logic #4624

ExecOp: update content-cache mount logic #4624

Conversation

jedevc commented Feb 5, 2024

tonistiigi left a comment

Choose a reason for hiding this comment

jedevc commented Feb 6, 2024