Support arm64 in cpubuilder dockerfile using multi-arch builds. #11
@@ -73,7 +74,8 @@ jobs:
         uses: docker/build-push-action@f2a1d5e99d037542a71f64918e516c093c6f3fc4
         with:
           context: .
-          file: dockerfiles/cpubuilder_ubuntu_jammy_ghr_x86_64.Dockerfile
+          file: dockerfiles/cpubuilder_ubuntu_jammy_ghr.Dockerfile
+          platforms: linux/amd64,linux/arm64
I'm expecting these multi-arch/platform builds to show up in github packages like so: https://github.com/myoung34/docker-github-actions-runner/pkgs/container/docker-github-actions-runner/275948281?tag=ubuntu-jammy
Nice! I've skimmed through and all looks good to me. From what I can tell, the only platform-dependent part is the installation of Btw, how can I add
Yep! The rest leans on already published multi-arch packages and portable scripts. Turns out multi-arch builds work well when they are single stage. We made things overcomplicated on iree-org/iree#14372 with the multistage builds (
We still have an arm64 runner in the IREE repo for now (that's the only runner that Google is still providing), so I was planning on porting https://github.com/iree-org/iree/blob/main/.github/workflows/ci_linux_arm64_clang.yml after landing this and getting the first images published to GitHub's registry. Might need a bit of back and forth as I can't test the workflows and dockerfile changes in a single PR.
The install steps from
This looks really nice!
8da8f75 to 630a91b
@@ -38,7 +38,8 @@ jobs:
         uses: docker/build-push-action@f2a1d5e99d037542a71f64918e516c093c6f3fc4
         with:
           context: .
-          file: dockerfiles/cpubuilder_ubuntu_jammy_x86_64.Dockerfile
+          file: dockerfiles/cpubuilder_ubuntu_jammy.Dockerfile
+          platforms: linux/amd64,linux/arm64
https://github.com/iree-org/base-docker-images/actions/runs/10949036140/job/30401394373#step:5:135
ERROR: Multi-platform build is not supported for the docker driver.
Switch to a different driver, or turn on the containerd image store, and try again.
Learn more at https://docs.docker.com/go/build-multi-platform/
Error: buildx failed with: Learn more at https://docs.docker.com/go/build-multi-platform/
Aww... I hit the same error locally while developing. The recommended steps in those docs worked for me, but I need to see what the docker/build-push-action needs.
Fixed? a61eb38
So far so good: https://github.com/iree-org/base-docker-images/actions/runs/10949139831
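For readers hitting the same "Multi-platform build is not supported for the docker driver" error in a workflow, here is a minimal sketch of one common setup. It is an assumption about the general approach, not the contents of commit a61eb38: the job name `build`, the action version tags, and `push: false` are illustrative. The idea is to register QEMU emulation and create a Buildx builder (which defaults to the `docker-container` driver) before calling `docker/build-push-action`.

```yaml
# Sketch only: job name, action versions, and push setting are assumptions,
# not taken from this PR.
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Register QEMU so that non-native (e.g. arm64) build steps can run
      # on an x86_64 runner.
      - uses: docker/setup-qemu-action@v3
      # Create a Buildx builder. Its default docker-container driver
      # supports multi-platform builds, unlike the plain docker driver.
      - uses: docker/setup-buildx-action@v3
      - uses: docker/build-push-action@v6
        with:
          context: .
          file: dockerfiles/cpubuilder_ubuntu_jammy.Dockerfile
          platforms: linux/amd64,linux/arm64
          push: false
```

Emulated arm64 steps tend to be much slower than native ones, which would be consistent with the longer workflow run times reported in this thread.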
Yep, successful build and push. The workflow run time went up from 2 minutes to 20 minutes, in large part due to the ccache build from source.
Before: https://github.com/iree-org/base-docker-images/actions/runs/10948990985
After: https://github.com/iree-org/base-docker-images/actions/runs/10949139831
New package: https://github.com/iree-org/base-docker-images/pkgs/container/cpubuilder_ubuntu_jammy
Starting to test at https://github.com/iree-org/iree/actions/runs/10949782485/job/30403753082 with this code: https://github.com/iree-org/iree/compare/users/scotttodd/ci-arm64?expand=1. setup-qemu-action did not do what I wanted :P
Follow-up to #11. This emulator was originally installed in iree-org/iree#16331. We've been carrying it around since then. I'm not thrilled with the file sitting in a cloud storage bucket (GCS or Azure). The file is 5MB so maybe we could check it in here via git LFS? Or we could build it from source via our automation.
Progress on #15332. This was the last active use of [`build_tools/docker/`](https://github.com/iree-org/iree/tree/main/build_tools/docker), so we can now delete that directory: #18566.

This uses the same "cpubuilder" dockerfile as the x86_64 builds, which is now built for multiple architectures thanks to iree-org/base-docker-images#11. As before, we install a qemu binary in the dockerfile, this time using the approach in iree-org/base-docker-images#13 instead of a forked dockerfile.

Prior PRs for context:

* #14372
* #16331

Build time varies pretty wildly depending on cache hit rate and the phase of the moon:

| Scenario | Cache hit rate | Time | Logs |
| -- | -- | -- | -- |
| Cold cache | 0% | 1h45m | [Logs](https://github.com/iree-org/iree/actions/runs/10962049593/job/30440393279) |
| Warm (?) cache | 61% | 48m | [Logs](https://github.com/iree-org/iree/actions/runs/10963546631/job/30445257323) |
| Warm (hot?) cache | 98% | 16m | [Logs](https://github.com/iree-org/iree/actions/runs/10964289304/job/30447618503?pr=18569) |

CI history (https://github.com/iree-org/iree/actions/workflows/ci_linux_arm64_clang.yml?query=branch%3Amain) shows that regular 97% cache hit rates and 17 minute job times are possible. I'm not sure why one test run only got 61% cache hits. This job only runs nightly, so that's not a super high priority to investigate and fix.

If we migrate the arm64 runner off of GCP (#18238) we can further simplify this workflow by dropping its reliance on `gcloud auth application-default print-access-token` and the `docker_run.sh` script. Other workflows are now using `source setup_sccache.sh` and some other code.
In iree-org/base-docker-images#11, I replaced the single-architecture `cpubuilder_ubuntu_jammy_x86_64` dockerfile with a multi-architecture `cpubuilder_ubuntu_jammy` dockerfile. This checks that the `linux/amd64` platform build of the dockerfile still works for our usage.
This should let us replace https://github.com/iree-org/iree/blob/main/build_tools/docker/dockerfiles/base-arm64.Dockerfile with this cpubuilder dockerfile, making progress on iree-org/iree#15332. It uses a multi-architecture build rather than a fully forked file, which seems to work reasonably well with some local testing. I can even run the arm64 dockerfile on my x86_64 host.
Various related changes are included here:

* Drop `x86_64` from file names
  * `publish_cpubuilder_x86_64.yml` --> `publish_cpubuilder.yml`
  * `cpubuilder_ubuntu_jammy_x86_64.Dockerfile` --> `cpubuilder_ubuntu_jammy.Dockerfile`
  * `cpubuilder_ubuntu_jammy_ghr_x86_64.Dockerfile` --> `cpubuilder_ubuntu_jammy_ghr.Dockerfile`
* Build with `--platform linux/amd64,linux/arm64` and update docs for this
* Add `build_tools/install_ccache.sh` (code lifted from https://github.com/iree-org/iree/blob/main/build_tools/docker/context/install_ccache.sh). Note that if we standardize on sccache we can drop the ccache install entirely