-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-36154: [JS][CI] Use jest
cache in CI
#36373
Conversation
Signed-off-by: abetomo <abe@enzou.tokyo>
.github/workflows/js.yml
Outdated
uses: actions/cache@v3 | ||
with: | ||
path: js/.jest-cache | ||
key: js-jest-cache-${{ runner.os }}-${{ matrix.node-version }}-${{ hashFiles('js/yarn.lock') }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
key: js-jest-cache-${{ runner.os }}-${{ matrix.node-version }}-${{ hashFiles('js/yarn.lock') }} | |
key: js-jest-cache-${{ runner.os }}-${{ matrix.node }}-${{ hashFiles('js/yarn.lock') }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, thanks. Fixed.
.github/workflows/js.yml
Outdated
path: js/.jest-cache | ||
key: js-jest-cache-${{ runner.os }}-${{ matrix.node-version }}-${{ hashFiles('js/yarn.lock') }} | ||
restore-keys: | | ||
js-jest-cache-${{ runner.os }}-${{ matrix.node-version }}- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
js-jest-cache-${{ runner.os }}-${{ matrix.node-version }}- | |
js-jest-cache-${{ runner.os }}-${{ matrix.node }}- |
.github/workflows/js.yml
Outdated
key: js-jest-cache-${{ runner.os }}-${{ matrix.node-version }}-${{ hashFiles('js/yarn.lock') }} | ||
restore-keys: | | ||
js-jest-cache-${{ runner.os }}-${{ matrix.node-version }}- | ||
js-jest-cache-${{ runner.os }}- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this?
(Can we reuse cache for different Node.js version?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cache is reused even if different Node.js versions are used.
I have confirmed this by switching Node.js versions locally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then, can we remove -${{ matrix.node }}
from key
and restore-keys
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will you be testing on more than one node
in the future?
(like node: [18, 20]
.)
I just tested a few times locally and it seemed a little faster to use the cache of the same Node.js version.
It is certainly unnecessary if you don't test with multiple node
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'll not use multiple Node.js versions unless we reduce CI time. (40+ minutes is too long to run multiple jobs. If we want to test against multiple Node.js versions, we'll use nightly CI that are defined in dev/tasks/tasks.yml
.)
If the difference between with/without -${{ matrix.node }}
, we want to unify it to reduce cache size. apache/arrow has multiple language implementations. So cache limit provided by GitHub Actions is small for us. So we want to reduce cache size as much as possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I got it.
I removed matrix.node
.
.github/workflows/js.yml
Outdated
key: js-jest-cache-${{ runner.os }}-${{ matrix.node-version }}-${{ hashFiles('js/yarn.lock') }} | ||
restore-keys: | | ||
js-jest-cache-${{ runner.os }}-${{ matrix.node-version }}- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
key: js-jest-cache-${{ runner.os }}-${{ matrix.node-version }}-${{ hashFiles('js/yarn.lock') }} | |
restore-keys: | | |
js-jest-cache-${{ runner.os }}-${{ matrix.node-version }}- | |
key: js-jest-cache-${{ runner.os }}-${{ matrix.node }}-${{ hashFiles('js/yarn.lock') }} | |
restore-keys: | | |
js-jest-cache-${{ runner.os }}-${{ matrix.node }}- |
Great. Do you have some before and after measurements? |
Signed-off-by: abetomo <abe@enzou.tokyo>
@domoritz Thanks for the review. |
Signed-off-by: abetomo <abe@enzou.tokyo>
Great. The times seem faster although not by as much as I expected. Are we sure that caching doesn’t introduce issues where tests are stale? What is in the cache? |
https://jestjs.io/docs/code-transformation
Jest caches transformed codes. The documentation notes the following.
So, it is not always possible to say that problems will not occur. |
So should we use the cache across lock files as well then? |
Do you mean to cache |
No, caching node modules is a separate issue and I don't think it'll help much. What I mean is that in https://github.com/apache/arrow/pull/36373/files#diff-9992b6f0e329a6dc49913c62f6efe2102f7a64f55d4bdab3d47df86292db5e87R97 we key by the lockfile but if the jest cache is automatically only used when its applicable, we can remove the restriction to specific node and lockfile versions. |
At first, I too thought that the key was |
Why would it be an issue if the cache is not updated? It feels odd to update based on the lock file rather than some other test related property. |
I agree that using a lock file is not the best way to do this. If the cache is not updated, the cache will not be usable and the speed improvement will be less effective. |
What are the inputs for jest's cache? We should use a hash value for them in |
What we really want is a way to use a cache and still update it. Anything else will not catch many cases and the CI will still time out. |
Since the cache for jest is a transformed code, it is created from the code in |
Then, we can use We should not use |
This is not possible due to the way actions/cache works once a cache with a certain key is created it is immutable. But using the ts file hashes in the key should work as you intend imo. (unless dependencies also come into play than you might want to ad the lock file as kou said) |
Makes sense that caches are immutable. Do we have a way to check whether the cache is used? If it’s used even if dependencies change, I think we should not include the lock file in the hash. |
I am not sure what you mean with if the cache is used? Used by jest? I don't know @abetomo? Though actions/cache sets an output value if there was a cache hit or not, if that is what you mean. |
An existing cache is used even when dependencies are only changed because we have |
What I mean is that I’d like to make sure we know when the jest cache is actually used by jest or whether jest ignored it because the cache wasn’t applicable to the code. |
The cache is generated in There does not seem to be any option in jest to more easily know if the cache is used or not. |
I tried it and when the package related to the code transform changes (when the lock file is updated), jest cache is also updated. What about the key |
Signed-off-by: abetomo <abe@enzou.tokyo>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
I'll merge in a few days unless nobody objects this.
Thanks for the merge. |
After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit b07e3f1.There were 3 benchmark results indicating a performance regression:
The full Conbench report has more details. |
Rationale for this change
Use the cache of
test
to reduce test time.What changes are included in this PR?
Use
jest
cache withactions/cache@v3
.Are these changes tested?
In GitHub Actions, test time was reduced.
Are there any user-facing changes?
macos
in GitHub Actions #36154