-
-
Notifications
You must be signed in to change notification settings - Fork 632
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce explicit cache writing job in RscCompile task #8190
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Would you mind adding a test in https://github.com/pantsbuild/pants/blob/master/tests/python/pants_test/backend/jvm/tasks/jvm_compile/java/test_cache_compile_integration.py ?
if not hit_cache: | ||
counter_val = str(counter()).rjust(counter.format_length(), ' ') | ||
counter_str = '[{}/{}] '.format(counter_val, counter.size) | ||
self.context.log.info( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, let's skip the output here (or move it to log.debug
).
@@ -289,6 +289,9 @@ def _zinc_key_for_target(self, target, workflow): | |||
'rsc-and-zinc': lambda: 'zinc[rsc-and-zinc]({})'.format(target.address.spec), | |||
})() | |||
|
|||
def _cachewrite_key_for_target(self, target): | |||
return 'cachewrite({})'.format(target.address.spec) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return 'cachewrite({})'.format(target.address.spec) | |
return 'write_to_cache({})'.format(target.address.spec) |
@@ -386,6 +389,24 @@ def nonhermetic_digest_classpath(): | |||
# Update the products with the latest classes. | |||
self.register_extra_products_from_contexts([ctx.target], compile_contexts) | |||
|
|||
def work_for_vts_cachewrite(vts, ctx): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def work_for_vts_cachewrite(vts, ctx): | |
def work_for_vts_write_to_cache(vts, ctx): |
def work_for_vts_cachewrite(vts, ctx): | ||
# Double check the cache before beginning compilation | ||
hit_cache = self.check_cache(vts, counter) | ||
target = ctx.target |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check is performed in multiple places in this method. Is it possible to factor out this check to reduce boilerplate (which matters since this is a very complex task)?
items_to_report_element([t.address.reference() for t in vts.targets], 'target'), | ||
' (', | ||
ctx.target.address.spec, | ||
').') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Python 3 allows f-strings, which make this type of formatting much easier to follow. Additionally, I believe the elements such as items_to_report_element(ctx.sources, '{} source'.format(self.name()))
are repeated elsewhere in this file as well. Is it possible to factor out that message templating into its own function?
(ftr: there are two legit failures in CI) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for diving in to write the test.
I have only nits, so don't bother fixing them unless you need another CI run anyway.
tests/python/pants_test/backend/jvm/tasks/jvm_compile/java/test_cache_compile_integration.py
Outdated
Show resolved
Hide resolved
Talked with Danny about this, and he's fine with merging. |
### Problem Rsc outlining and zinc compiles of a target will race to write to the cache. This will usually result in no zinc classes in the artifact due because zinc will hit the cache during the cache double-check, causing runtime classpath to be missing classes. ### Solution Create an explicit cache-write job dependent on the Rsc and zinc jobs completing, removing the cache write from those jobs. ### Result Artifacts for `rsc-and-zinc` targets will deterministically contain both Rsc and zinc classfiles as expected.
### Problem #8190 moved cache writing out of the completion of the zinc and rsc jobs and into a dependent job. But at the same time, we also had multiple attempts to "double check" the cache happening concurrently due to both the zinc and rsc jobs checking, and that race could lead to partial entries being extracted. ### Solution Since we can't actually cancel or coordinate the concurrent work, we can't safely double check the cache once either job has started. So instead, this change extracts the cache double-check into its own job that both the zinc and rsc tasks will depend on.
### Problem #8190 moved cache writing out of the completion of the zinc and rsc jobs and into a dependent job. But at the same time, we also had multiple attempts to "double check" the cache happening concurrently due to both the zinc and rsc jobs checking, and that race could lead to partial entries being extracted. ### Solution Since we can't actually cancel or coordinate the concurrent work, we can't safely double check the cache once either job has started. So instead, this change extracts the cache double-check into its own job that both the zinc and rsc tasks will depend on.
Problem
Rsc outlining and zinc compiles of a target will race to write to the cache.
This will usually result in no zinc classes in the artifact due because zinc
will hit the cache during the cache double-check, causing runtime classpath to
be missing classes.
Solution
Create an explicit cache-write job dependent on the Rsc and zinc jobs
completing, removing the cache write from those jobs.
Result
Artifacts for
rsc-and-zinc
targets will deterministically contain both Rsc andzinc classfiles as expected.