Implement Test caching #4587
Some notes on invalidation vs caching. Here invalidation refers to executing task work locally when target fingerprints change, while caching refers to storing the results of that work for re-use by others.

A compromise might be to only cache when the whole partition was invalid and the task work succeeded (ie, when `all_vts == invalid_vts`).
In this scenario, Jake will likely be a CI process, in which case other humans will see a cached result from Jane's commit. In code form, is this approach generically advisable?

```python
class AnyTask(Task):
  def check_artifact_cache_for(self, invalidation_check):
    return VersionedTargetSet.from_versioned_targets(invalidation_check.all_vts)

  def execute(self):
    # Other strategies could be used here; the key bit under consideration is the case
    # where the resulting partition has more than one actionable target for this task.
    partition = self.context.targets()
    with self.invalidated(partition) as invalidation_check:
      invalid_targets = [invalid_target
                         for vts in invalidation_check.invalid_vts
                         for invalid_target in vts.targets]
      results_dir = self._execute_task_work(invalid_targets)
      if invalidation_check.all_vts == invalidation_check.invalid_vts:
        # We executed against all targets in the partition successfully
        # (`_execute_task_work` did not raise), so cache data files for others to see.
        cache_vts = self.check_artifact_cache_for(invalidation_check)
        self._copy_results(results_dir, cache_vts.results_dir)
      else:
        # We don't cache results, so others will need to re-run this partition.
        # NB: We will presumably commit this change now though, and so others will get
        # this partition in a state that executes successfully; so when the 1st of the
        # others executes against this partition, they will hit `all_vts == invalid_vts`
        # and cache the results. That 1st of others is hopefully CI!
        pass
```
To me, the compromise sounds like it would be the sanest first step. Internally for buildcache, we do something similar, where we only write artifacts to the cache from a CI job and read from everywhere else (laptops, CI). This ensures repeatability, and abstracts us from the issues of differing configurations/environments, or just unreliability in caching the results of intermediate work. So it comes into play only if you are building/running code that's already in master.

Having said that, this sounds like it would provide little to no benefit while iterating on tests, and would only benefit me in CI or when I am running tests locally (with no changes). A thought as a next step: what if we looked at invalidating just the target roots when there is a change, so …
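A minimal sketch of the CI-only cache-write policy described above, assuming the CI system exports a `CI=true` environment variable (as most do); the function names here are illustrative, not actual pants APIs:

```python
import os


def cache_reads_enabled():
  # Everyone (laptops and CI) reads from the artifact cache.
  return True


def cache_writes_enabled():
  # Only CI jobs write to the artifact cache, so cached artifacts are only
  # ever produced from code already landed in master, in a controlled
  # environment.
  return os.environ.get('CI', '').lower() == 'true'
```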
It is perennially surprising to me that running …
I'd be strongly opposed to doing any merging or splitting of outputs, so I think running only the invalid tests in a partition should not be a goal. If that means that we can only actually utilize the cache when there are lots of partitions (cache usage would be optimal when …).

So, I think your example works verbatim, with the exception that there are multiple partitions involved?
EDIT: Actually, trying to read a bit further between the lines here, I think you're probably pointing to the fact that both Ivy and Pex are using the entire target context, and that I had forgotten that this would be an issue with test caching. The issue is essentially that target fingerprints are fingerprints of the content on disk, rather than a fingerprint of the resulting products (ie, the inputs to the test run).

Looking at this issue again after a while, I wonder whether a better strategy for achieving test caching might be to skip attempting to use the existing fingerprinting/cache mechanism, and double down on what we have planned for remoting. Namely, we would begin implementing what we had planned there in terms of invoking the pytest/junit processes in isolated process sandboxes, and fingerprinting all of their inputs with the API described here.
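An illustrative sketch of that direction (assuming nothing about the actual engine API; the function and its parameters are made up): fingerprint an isolated process invocation by hashing all of its observable inputs, so the cache key covers the real inputs to the test run rather than just source content:

```python
import hashlib


def fingerprint_process_invocation(argv, env, input_files):
  """Hash everything an isolated test process could observe: its command line,
  its (selected) environment, and the exact bytes of its input files.

  Because resolved dependencies are materialized as input files in the sandbox,
  a changed resolve changes an input file, and therefore changes the key.
  """
  hasher = hashlib.sha256()
  for arg in argv:
    hasher.update(arg.encode('utf-8'))
  for key in sorted(env):
    hasher.update('{}={}'.format(key, env[key]).encode('utf-8'))
  for path in sorted(input_files):
    hasher.update(path.encode('utf-8'))
    with open(path, 'rb') as f:
      hasher.update(f.read())
  return hasher.hexdigest()
```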
More concretely, the exact issues would be #4586, #4585, and then a new issue to invoke …

EDIT: As a temporary alternative, doing what ResolvedJarAwareFingerprintStrategy does for Ivy for pex requirements, and using it in the pytest runner, might work. It's clearly not the long-term solution, since it's a whitelist of inputs to fingerprint rather than a blacklist, but it might be sufficient here. Given that both …

EDIT2: @kwlzn pointed out that implementing what you proposed and only supporting caching in a very specific case ("opportunistic, root-level caching on the task side": ie, when `all_vts == invalid_vts`) …

We'll work on getting a more complete definition of #4586 (or perhaps a followup?) posted, which will describe executing pex invokes in the engine as described above.
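A rough sketch of what that temporary alternative might look like on the python side (hypothetical names throughout; only the shape is borrowed from ResolvedJarAwareFingerprintStrategy):

```python
from hashlib import sha1


class ResolvedRequirementsAwareFingerprintStrategy(object):
  """Hypothetical pex analogue of ResolvedJarAwareFingerprintStrategy: mix the
  pinned requirements from the global resolve into each target's fingerprint,
  so that a changed resolve invalidates cached test results for
  otherwise-unchanged targets.
  """

  def __init__(self, resolved_requirements):
    # e.g. ['requests==2.9.1', 'six==1.10.0'], taken from the resolve product.
    self._resolved = sorted(resolved_requirements)

  def compute_fingerprint(self, target_fingerprint):
    # Combine the target's own content fingerprint with every resolved pin.
    hasher = sha1(target_fingerprint.encode('utf-8'))
    for pin in self._resolved:
      hasher.update(pin.encode('utf-8'))
    return hasher.hexdigest()
```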
Just to be sure I understand the problem: the issue is that what you need to cache is not just a pass/fail bit for each target, but other state, such as coverage? What exactly is the current universe of such "other state"?
@benjyw: This is the root of the complexity, IMO:
In essence: because pex and ivy look at "the entire context", and then create products that later tasks will consume, just looking at the transitive deps for any particular target is insufficient to get an accurate cache key (because something outside your transitive deps might have caused a different pex/ivy resolution). |
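A contrived illustration of that point (plain Python with made-up requirement strings; nothing here is real pants data):

```python
# Two builds of the exact same `lib` target, whose sources and transitive deps
# are identical, can still consume different resolved artifacts, because the
# global resolve unifies versions across *everything* in the context.
context_a = {'lib': ['requests>=2.0'], 'unrelated': ['requests==2.9.1']}
context_b = {'lib': ['requests>=2.0']}

# Resolving context_a pins lib's requests to 2.9.1 (to satisfy `unrelated`);
# resolving context_b is free to pick the newest 2.x. A cache key computed
# only from lib's own transitive deps cannot distinguish these two runs.
```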
@ity, re:

> this sounds like it would provide little to no benefit while iterating on tests

As a local developer, this strategy provides the status quo for iteration, since the invalidation (not caching) machinery is still in play. The only upside offered beyond the status quo is on cache hits (usually from CI runs) that would no-op or speed up the cold case. This strategy takes the conservative tack and does no "math" across non-identical target sets; ie …

@agenticarus agreed

@stuhood lots of edits there, and they all revolve around thinking more deeply about how tasks behave - namely the global resolution tasks. Per my response to Ity above, I'd love to ignore those problematic deep behaviors and set up caching initially to do the only obvious safe thing.

@benjyw coverage and the junit xml (some folks point CIs at these) for this specific py.test case.
Noting the results of an offline meeting re this discussion to date: it's agreed that the minimal correct caching strategy makes sense to apply for now to both the pytest and junit test runners. The next step after that is completed will be #4676.
@jsirois hey, not sure if you're aware, but the participant here is "benjyw" not "benjy", and I keep getting pinged for this issue. :)
@ity I'm going to close this since I think the motivating cases - python and jvm - are now covered. That's not to say there aren't problems with using test caching in the real world today, for example as pointed out by #4780 wrt loose test file access. There may be some follow-up refactors to extract commonality in the …
Once #4586 lands, devise a strategy for the implementation of test caching.

There are at least 2 layers of caching: by means of analysis files from zinc, or, if not found locally, being able to invoke RemoteProcessExecution.

As step 1, we need to find a way of running locally if remote execution doesn't exist or is not hooked up yet.
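A minimal sketch of that step-1 fallback, with `remote_runner` as a hypothetical stand-in for a hooked-up RemoteProcessExecution backend:

```python
import subprocess


def execute_test_process(argv, remote_runner=None):
  # If a remote execution backend is configured, delegate to it; otherwise
  # fall back to running the process locally. Returns the process exit code.
  if remote_runner is not None:
    return remote_runner(argv)
  return subprocess.call(argv)


# Usage: no backend hooked up yet, so this runs locally.
# execute_test_process(['pytest', 'tests/'])
```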