-
-
Notifications
You must be signed in to change notification settings - Fork 510
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CI Build & Test: First build incrementally and test changed files only #35652
Conversation
.github/workflows/build.yml
Outdated
git config --global user.email "ci-sage@example.com" | ||
git config --global user.name "Build & Test workflow" | ||
# If actions/checkout downloaded using the GitHub REST API: | ||
if [ ! -d .git ]; then git config --global --add safe.directory $(pwd) && git init && git add -A && git commit --quiet -m "new"; fi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of this git acrobatics, why not get the list of changed files using say https://github.com/marketplace/actions/get-all-changed-files and then run the tests on each of them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not just for finding the changed files.
This trick makes the build of the Sage library actually incremental!
.github/workflows/build.yml
Outdated
eval $(sage-print-system-package-command auto update) | ||
eval $(sage-print-system-package-command auto --spkg --yes --no-install-recommends install git) | ||
|
||
- name: Incremental build and test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this coming before the actual build/is this not building things twice now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could, in fact, also run the main build on the incrementally updated source tree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this "incremental" build, what is build if one changes a python file, a cython file and say something in the build infrastructure (say a package update and a change in setup.py)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The git acrobatics has exactly the same effect as doing a "git checkout" to the new ref in a full build of the Sage distribution on a developer's computer: The timestamps (mtime) of unmodified files are kept the same, and the timestamps of modified files are newer than any file that was there before.
It is looking so complicated because the source tree in our Docker images is actually not a git working copy. This is why I have to turn it surgically into a working copy (without messing up the timestamps).
Running make build
results in an incremental build: For spkgs, if the declared package version is changed, then (because they no longer match the installation records in {local,venv}/var/libs/sage/installed
these packages are reinstalled.
For the Sage library, the normal in-tree editable build is run, which is incremental (only the modified Cython files are cythonized and the result compiled). Because we are using editable mode, nothing interesting is happening with the modified Python files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the details!
But doesn't this mean that sometimes the incremental build (+ test) fails? At least locally, I sometimes have to hard-reset everything and run the whole bootstrap + configure + make cycle. In this cases, the workflow would fail, right? I'm not sure if this then more confusing than helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my experience, incremental builds are very robust, with the following exceptions:
- the docbuild process
- after indiscriminate updates of system packages (we don't do that here on this workflow)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By "hard reset", do you mean that you do "make distclean" (= remove local and venv)?
Sometimes, but most often I only have to run bootstrap & configure again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my experience, incremental builds are very robust, with the following exceptions:
Okay, then I let someone else with more experience on the build process have a look at this PR. Since this is the most important build workflow, it should only fail for the correct reason and not because of some general issues related to the build process.
If the incremental build is not (yet) reliable enough, I would favor a normal build + test on changed files + long test; and/or extracting the incremental build to a new workflow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By "hard reset", do you mean that you do "make distclean" (= remove local and venv)?
Sometimes, but most often I only have to run bootstrap & configure again.
Ah OK. We can certainly run bootstrap
manually before doing the incremental build. I'll make this change.
(configure definitely does not have to be run manually.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the incremental build is not (yet) reliable enough, I would favor a normal build + test on changed files + long test; and/or extracting the incremental build to a new workflow.
@tobiasdiez @kwankyu I have now implemented a fallback to non-incremental when the incremental build fails – both for build and for docbuild.
a82f6a1
to
f9b4412
Compare
This works well in the step "Incremental build and test", but in the step "Configure new tree" fails with
and then in the step "Build and test modularized distributions" fails with
|
…emental build (fixup)
Thanks. I am testing it again. So far so good. Tobias' concern remains. I think incremental build (except doc building) is reliable, but we would not know if it (the concern) is real before we actually deploy incremental build for testing. It is worth to try! |
In my own testing of this PR and also here, we have failure
|
"Build & Test" ran the build workflow of this PR. How is this possible? |
Strange indeed. I'll restart the workflow to see if this is deterministic |
This seems just normal. GitHub Actions runs workflow on |
See https://doc.sagemath.org/html/en/developer/packaging.html#package-dependencies I added this when I added |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
Thank you! |
Documentation preview for this PR (built with commit 908ad71) is ready! 🎉 |
Isn't this so designed that if there are doctest failures in the initial incremental build, it is reported and the B&T stops? |
is it not working as expected? |
See this #35809 |
It is not working as I expected. But perhaps it's working as it was designed? |
In https://github.com/sagemath/sage/actions/runs/5349883585/jobs/9701778982?pr=35809 it is reported by a red X in the step. Are you saying that the final result is a green checkmark but it should be a red X? |
Yes. I expect that the B&T itself stops with red X right after the red X in the step. |
so that people can recognize the failure asap. |
It is okay if it was so designed that the full test always runs. It is an improvement. But It seems a little inconvenient that people need to click the B&T to recognize the doctest failures in the incremental build while the full B&T will take much longer because of |
I did design it so, but you have a very good point. |
I'll be happy to change this, but I first have to take care of this strange bug: |
…html <!-- Please provide a concise, informative and self-explanatory title. --> <!-- Don't put issue numbers in the title. Put it in the Description below. --> <!-- For example, instead of "Fixes #12345", use "Add a new method to multiply two integers" --> ### 📚 Description This new feature from #35652 was supposed to show the changed HTML files, but a last minute change there destroyed it. Here we fix it. <!-- Describe your changes here in detail. --> <!-- Why is this change required? What problem does it solve? --> <!-- If this PR resolves an open issue, please link to it here. For example "Fixes #12345". --> <!-- If your change requires a documentation PR, please link it appropriately. --> ### 📝 Checklist <!-- Put an `x` in all the boxes that apply. It should be `[x]` not `[x ]`. --> - [x] The title is concise, informative, and self-explanatory. - [ ] The description explains in detail what this PR is about. - [ ] I have linked a relevant issue or discussion. - [ ] I have created tests covering the changes. - [ ] I have updated the documentation accordingly. ### ⌛ Dependencies <!-- List all open PRs that this PR logically depends on - #12345: short description why this is a dependency - #34567: ... --> <!-- If you're unsure about any of these, don't hesitate to ask. We're here to help! --> URL: #35808 Reported by: Matthias Köppe Reviewer(s): Kwankyu Lee, Matthias Köppe
<!-- Please provide a concise, informative and self-explanatory title. --> <!-- Don't put issue numbers in the title. Put it in the Description below. --> <!-- For example, instead of "Fixes #12345", use "Add a new method to multiply two integers" --> ### 📚 Description <!-- Describe your changes here in detail. --> The CI change in #35652 did not handle the case of a PR that adds files correctly: the fallback to the non-incremental build would delete these files. We fix it here. <!-- Why is this change required? What problem does it solve? --> <!-- If this PR resolves an open issue, please link to it here. For example "Fixes #12345". --> Fixes <!-- If your change requires a documentation PR, please link it appropriately. --> ### 📝 Checklist <!-- Put an `x` in all the boxes that apply. It should be `[x]` not `[x ]`. --> - [x] The title is concise, informative, and self-explanatory. - [x] The description explains in detail what this PR is about. - [x] I have linked a relevant issue or discussion. - [ ] I have created tests covering the changes. - [ ] I have updated the documentation accordingly. ### ⌛ Dependencies <!-- List all open PRs that this PR logically depends on - #12345: short description why this is a dependency - #34567: ... --> <!-- If you're unsure about any of these, don't hesitate to ask. We're here to help! --> URL: #35865 Reported by: Matthias Köppe Reviewer(s): Dima Pasechnik
<!-- ^^^^^ Please provide a concise, informative and self-explanatory title. Don't put issue numbers in there, do this in the PR body below. For example, instead of "Fixes sagemath#1234" use "Introduce new method to calculate 1+1" --> <!-- Describe your changes here in detail --> Reworking sagemath#35652 as discussed in sagemath#36469 (comment) Also applying the changes made in sagemath#35866 (cosmetic improvements to the Actions logs of doc-build.yml) to doc-build-pdf.yml <!-- Why is this change required? What problem does it solve? --> <!-- If this PR resolves an open issue, please link to it here. For example "Fixes sagemath#12345". --> <!-- If your change requires a documentation PR, please link it appropriately. --> ### 📝 Checklist <!-- Put an `x` in all the boxes that apply. --> <!-- If your change requires a documentation PR, please link it appropriately --> <!-- If you're unsure about any of these, don't hesitate to ask. We're here to help! --> <!-- Feel free to remove irrelevant items. --> - [x] The title is concise, informative, and self-explanatory. - [ ] The description explains in detail what this PR is about. - [x] I have linked a relevant issue or discussion. - [ ] I have created tests covering the changes. - [ ] I have updated the documentation accordingly. ### ⌛ Dependencies <!-- List all open PRs that this PR logically depends on - sagemath#12345: short description why this is a dependency - sagemath#34567: ... --> <!-- If you're unsure about any of these, don't hesitate to ask. We're here to help! --> URL: sagemath#36473 Reported by: Matthias Köppe Reviewer(s): Kwankyu Lee, Matthias Köppe
I'm taking care of this in #36498 |
…arate jobs for pyright, build, modularized tests, long tests <!-- ^^^^^ Please provide a concise, informative and self-explanatory title. Don't put issue numbers in there, do this in the PR body below. For example, instead of "Fixes sagemath#1234" use "Introduce new method to calculate 1+1" --> <!-- Describe your changes here in detail --> <!-- Why is this change required? What problem does it solve? --> Running the containers explicitly, instead of using the declarative `container:` feature of GH Actions gives us more control: - we can create more space on the host if necessary; we just scraped by an out of space condition in sagemath#36473 / sagemath#36469 (comment) - we can run some operations outside of the container but in the same job; this will make the separate "Get CI fixes" jobs unnecessary, addressing the cosmetic concerns from sagemath#36338 (comment), sagemath#36349 - it enables caching between the various workflows (as first discussed in sagemath#36446). We split out static type checking with pyright into its separate workflow: - **pyright.yml**: As static checking does not need a build of the Sage library, for PRs that do not make any changes to packages, there's nothing to build, and the workflow gives a fast turnaround just after 10 minutes. For applying the CI fixes from blocker tickets, this workflow uses the technique favored in sagemath#36349. The workflow **build.yml** first launches a job: - **test-new:** It builds incrementally (using a tox-generated `Dockerfile` and https://github.com/marketplace/actions/build-and-push- docker-images) and does the quick incremental test. This completes within 10 to 20 minutes when there's no change. After this is completed, more jobs are launched: - **test-mod:** It again builds incrementally and tests a modularized distribution. Later (with more from sagemath#35095), more jobs will be added to this matrix job for other distributions. - **test-long:** It again builds incrementally and runs the long test. The workflows **doc-build.yml** and **doc-build-pdf.yml** again build incrementally and then build the documentation. The diffing code for the HTML documentation now runs in the host, not the container, which simplifies things. (To show that diffing still works, we include a small change to the Sage library.) Splitting the workflow jobs implements @kwankyu's suggestion from: - sagemath#35652 (comment) (Fixes sagemath#35814) The steps "again builds incrementally" actually use the GH Actions cache, https://docs.docker.com/build/ci/github-actions/cache/#cache- backend-api. When nothing is cached and the 3 workflows are launched in parallel, they may each build the same thing. But when there's congestion that leads to the workflows to be run serially, the 2nd and 3rd workflow will be able to use the cache from the 1st workflow. This elasticity may be helpful in reducing congestion. There is a rather small per-project limit of 10 GB for this cache, so we'll have to see how effectively caching works in practice. See https://github.com/sagemath/sage/actions/caches <!-- If this PR resolves an open issue, please link to it here. For example "Fixes sagemath#12345". --> <!-- If your change requires a documentation PR, please link it appropriately. --> ### 📝 Checklist <!-- Put an `x` in all the boxes that apply. --> <!-- If your change requires a documentation PR, please link it appropriately --> <!-- If you're unsure about any of these, don't hesitate to ask. We're here to help! --> <!-- Feel free to remove irrelevant items. --> - [x] The title is concise, informative, and self-explanatory. - [x] The description explains in detail what this PR is about. - [x] I have linked a relevant issue or discussion. - [ ] I have created tests covering the changes. - [ ] I have updated the documentation accordingly. ### ⌛ Dependencies <!-- List all open PRs that this PR logically depends on - sagemath#12345: short description why this is a dependency - sagemath#34567: ... --> - Depends on sagemath#36938 (merged here) - Depends on sagemath#36951 (merged here) - Depends on sagemath#37351 (merged here) - Depends on sagemath#37750 (merged here) <!-- If you're unsure about any of these, don't hesitate to ask. We're here to help! --> URL: sagemath#36498 Reported by: Matthias Köppe Reviewer(s): Kwankyu Lee, Matthias Köppe
…arate jobs for pyright, build, modularized tests, long tests <!-- ^^^^^ Please provide a concise, informative and self-explanatory title. Don't put issue numbers in there, do this in the PR body below. For example, instead of "Fixes sagemath#1234" use "Introduce new method to calculate 1+1" --> <!-- Describe your changes here in detail --> <!-- Why is this change required? What problem does it solve? --> Running the containers explicitly, instead of using the declarative `container:` feature of GH Actions gives us more control: - we can create more space on the host if necessary; we just scraped by an out of space condition in sagemath#36473 / sagemath#36469 (comment) - we can run some operations outside of the container but in the same job; this will make the separate "Get CI fixes" jobs unnecessary, addressing the cosmetic concerns from sagemath#36338 (comment), sagemath#36349 - it enables caching between the various workflows (as first discussed in sagemath#36446). We split out static type checking with pyright into its separate workflow: - **pyright.yml**: As static checking does not need a build of the Sage library, for PRs that do not make any changes to packages, there's nothing to build, and the workflow gives a fast turnaround just after 10 minutes. For applying the CI fixes from blocker tickets, this workflow uses the technique favored in sagemath#36349. The workflow **build.yml** first launches a job: - **test-new:** It builds incrementally (using a tox-generated `Dockerfile` and https://github.com/marketplace/actions/build-and-push- docker-images) and does the quick incremental test. This completes within 10 to 20 minutes when there's no change. After this is completed, more jobs are launched: - **test-mod:** It again builds incrementally and tests a modularized distribution. Later (with more from sagemath#35095), more jobs will be added to this matrix job for other distributions. - **test-long:** It again builds incrementally and runs the long test. The workflows **doc-build.yml** and **doc-build-pdf.yml** again build incrementally and then build the documentation. The diffing code for the HTML documentation now runs in the host, not the container, which simplifies things. (To show that diffing still works, we include a small change to the Sage library.) Splitting the workflow jobs implements @kwankyu's suggestion from: - sagemath#35652 (comment) (Fixes sagemath#35814) The steps "again builds incrementally" actually use the GH Actions cache, https://docs.docker.com/build/ci/github-actions/cache/#cache- backend-api. When nothing is cached and the 3 workflows are launched in parallel, they may each build the same thing. But when there's congestion that leads to the workflows to be run serially, the 2nd and 3rd workflow will be able to use the cache from the 1st workflow. This elasticity may be helpful in reducing congestion. There is a rather small per-project limit of 10 GB for this cache, so we'll have to see how effectively caching works in practice. See https://github.com/sagemath/sage/actions/caches <!-- If this PR resolves an open issue, please link to it here. For example "Fixes sagemath#12345". --> <!-- If your change requires a documentation PR, please link it appropriately. --> ### 📝 Checklist <!-- Put an `x` in all the boxes that apply. --> <!-- If your change requires a documentation PR, please link it appropriately --> <!-- If you're unsure about any of these, don't hesitate to ask. We're here to help! --> <!-- Feel free to remove irrelevant items. --> - [x] The title is concise, informative, and self-explanatory. - [x] The description explains in detail what this PR is about. - [x] I have linked a relevant issue or discussion. - [ ] I have created tests covering the changes. - [ ] I have updated the documentation accordingly. ### ⌛ Dependencies <!-- List all open PRs that this PR logically depends on - sagemath#12345: short description why this is a dependency - sagemath#34567: ... --> - Depends on sagemath#36938 (merged here) - Depends on sagemath#36951 (merged here) - Depends on sagemath#37351 (merged here) - Depends on sagemath#37750 (merged here) <!-- If you're unsure about any of these, don't hesitate to ask. We're here to help! --> URL: sagemath#36498 Reported by: Matthias Köppe Reviewer(s): Kwankyu Lee, Matthias Köppe
…arate jobs for pyright, build, modularized tests, long tests <!-- ^^^^^ Please provide a concise, informative and self-explanatory title. Don't put issue numbers in there, do this in the PR body below. For example, instead of "Fixes sagemath#1234" use "Introduce new method to calculate 1+1" --> <!-- Describe your changes here in detail --> <!-- Why is this change required? What problem does it solve? --> Running the containers explicitly, instead of using the declarative `container:` feature of GH Actions gives us more control: - we can create more space on the host if necessary; we just scraped by an out of space condition in sagemath#36473 / sagemath#36469 (comment) - we can run some operations outside of the container but in the same job; this will make the separate "Get CI fixes" jobs unnecessary, addressing the cosmetic concerns from sagemath#36338 (comment), sagemath#36349 - it enables caching between the various workflows (as first discussed in sagemath#36446). We split out static type checking with pyright into its separate workflow: - **pyright.yml**: As static checking does not need a build of the Sage library, for PRs that do not make any changes to packages, there's nothing to build, and the workflow gives a fast turnaround just after 10 minutes. For applying the CI fixes from blocker tickets, this workflow uses the technique favored in sagemath#36349. The workflow **build.yml** first launches a job: - **test-new:** It builds incrementally (using a tox-generated `Dockerfile` and https://github.com/marketplace/actions/build-and-push- docker-images) and does the quick incremental test. This completes within 10 to 20 minutes when there's no change. After this is completed, more jobs are launched: - **test-mod:** It again builds incrementally and tests a modularized distribution. Later (with more from sagemath#35095), more jobs will be added to this matrix job for other distributions. - **test-long:** It again builds incrementally and runs the long test. The workflows **doc-build.yml** and **doc-build-pdf.yml** again build incrementally and then build the documentation. The diffing code for the HTML documentation now runs in the host, not the container, which simplifies things. (To show that diffing still works, we include a small change to the Sage library.) Splitting the workflow jobs implements @kwankyu's suggestion from: - sagemath#35652 (comment) (Fixes sagemath#35814) The steps "again builds incrementally" actually use the GH Actions cache, https://docs.docker.com/build/ci/github-actions/cache/#cache- backend-api. When nothing is cached and the 3 workflows are launched in parallel, they may each build the same thing. But when there's congestion that leads to the workflows to be run serially, the 2nd and 3rd workflow will be able to use the cache from the 1st workflow. This elasticity may be helpful in reducing congestion. There is a rather small per-project limit of 10 GB for this cache, so we'll have to see how effectively caching works in practice. See https://github.com/sagemath/sage/actions/caches <!-- If this PR resolves an open issue, please link to it here. For example "Fixes sagemath#12345". --> <!-- If your change requires a documentation PR, please link it appropriately. --> ### 📝 Checklist <!-- Put an `x` in all the boxes that apply. --> <!-- If your change requires a documentation PR, please link it appropriately --> <!-- If you're unsure about any of these, don't hesitate to ask. We're here to help! --> <!-- Feel free to remove irrelevant items. --> - [x] The title is concise, informative, and self-explanatory. - [x] The description explains in detail what this PR is about. - [x] I have linked a relevant issue or discussion. - [ ] I have created tests covering the changes. - [ ] I have updated the documentation accordingly. ### ⌛ Dependencies <!-- List all open PRs that this PR logically depends on - sagemath#12345: short description why this is a dependency - sagemath#34567: ... --> - Depends on sagemath#36938 (merged here) - Depends on sagemath#36951 (merged here) - Depends on sagemath#37351 (merged here) - Depends on sagemath#37750 (merged here) - Depends on sagemath#37277 (merged here) <!-- If you're unsure about any of these, don't hesitate to ask. We're here to help! --> URL: sagemath#36498 Reported by: Matthias Köppe Reviewer(s): Kwankyu Lee, Matthias Köppe
📚 Description
This gives a faster turnaround for errors in the committed changes.
When the incremental build (or the test) fails, we clean thoroughly and build sagelib from scratch (= what happened before this PR.)
Changing the docbuild CI in the same way.
The full doctests are now run with
--long
.Also:
sage -t --new
so it works in a git worktree and respects file-level annotations# sage.doctest: optional - ...
📝 Checklist
⌛ Dependencies