Use GCS for Windows ccache (#13183)
We have found the GitHub Actions built-in caching mechanism to be
extremely limiting: slow, small, and buggy. Switch instead to using our
own remote ccache hosted on GCS. This matches our Linux builds on our
self-hosted runners except that we have to do GCS auth through service
account keys, unfortunately, which means that access is restricted to
postsubmit runs. Luckily, for these builds we're generally doing
everything in one job and just want caching (which we only write on
postsubmit anyway) and don't need artifact storage (which we'd need on
presubmit too).

Tested:
Ran on this PR (hacked the workflow a bit). An
[initial
run](https://github.com/openxla/iree/actions/runs/4750257226/jobs/8438272681)
with an empty cache took 28m total, 15.5m of which was in the build
step. This includes writing the remote cache (minor overhead). A
[rerun](https://github.com/openxla/iree/actions/runs/4750257226/jobs/8438619413)
with a now populated cache took 14m total, 6.5m of which was in the
build step. 79% of compiler calls were cacheable and of those 99%
were remote cache hits. Contrast with a
[recent postsubmit
run](https://github.com/openxla/iree/actions/runs/4748717136/jobs/8435229260)
that ran on a docs-only change (so should've had a maximally populated
cache), which took 20m, 7m of which was the build step, 2m of which was
fetching the cache, and 1m of which was saving the cache. That's
setting aside
[runs like this
one](https://github.com/openxla/iree/actions/runs/4741863995/jobs/8419465087)
where fetching the cache just times out entirely (with no alerting
other than if you happen to look at the UI).
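
As a rough sanity check on the figures above, the arithmetic can be sketched as follows. The variable names are illustrative only, and the inputs are single-run timings copied from the linked jobs, so treat the differences as indicative rather than as a benchmark.

```python
# Timings (minutes) and rates quoted in this commit message.
# Each is from a single run, so run-to-run variance is not accounted for.
runs = {
    "gcs_cold": {"total": 28.0, "build": 15.5},
    "gcs_warm": {"total": 14.0, "build": 6.5},
    "actions_cache_warm": {"total": 20.0, "build": 7.0, "fetch": 2.0, "save": 1.0},
}
cacheable = 0.79        # fraction of compiler calls that were cacheable
remote_hit_rate = 0.99  # fraction of cacheable calls that hit the remote cache

# Fraction of all compiler calls served from the remote cache.
overall_remote_hits = cacheable * remote_hit_rate

# Time the old approach spent just moving the cache around.
old = runs["actions_cache_warm"]
cache_overhead = old["fetch"] + old["save"]

# Differences between the warm old-style run and the warm GCS run.
total_saved = old["total"] - runs["gcs_warm"]["total"]
build_saved = old["build"] - runs["gcs_warm"]["build"]

print(f"calls served from remote cache: {overall_remote_hits:.1%}")
print(f"old cache fetch+save overhead:  {cache_overhead:.1f}m")
print(f"total time difference:          {total_saved:.1f}m")
print(f"build step difference:          {build_saved:.1f}m")
```

The build step difference is small compared to the total difference, which is consistent with the saving coming mostly from no longer fetching and saving the cache rather than from faster compilation.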

Tragically, most of the time in all of these jobs is spent just
checking out the repository and submodules (see
actions/checkout#1186).

Overall this seems like a marked improvement. The main wins are in
avoiding tons of complexity futzing with cache compression levels and
restoring and saving the cache (actual cached build time is
~unchanged).

Part of #13028

skip-ci: Windows builds don't run on presubmit
GMNGeoffrey authored Apr 20, 2023
1 parent de2ecca commit 0ab01b6
Showing 1 changed file with 12 additions and 33 deletions.
45 changes: 12 additions & 33 deletions .github/workflows/ci.yml
@@ -83,19 +83,15 @@ jobs:
BUILD_DIR: build-windows
IREE_VULKAN_DISABLE: 1
steps:
- id: "gcp-auth"
name: "Authenticating to Google Cloud"
uses: "google-github-actions/auth@v1"
with:
token_format: "access_token"
credentials_json: "${{ secrets.IREE_OSS_GITHUB_RUNNER_BASIC_TRUST_SERVICE_ACCOUNT_KEY }}"
create_credentials_file: false
- name: "Checking out repository"
uses: actions/checkout@ac593985615ec2ede58e132d2e21d2b1cbd6127c # v3.3.0
# Attempt to restore from cache unconditionally.
# Note: this will first try to grab a cache entry for this exact commit
# then it will fall back to the latest for any commit.
- name: "Fetching cache (CMake/ccache)"
uses: actions/cache/restore@88522ab9f39a2ea568f7027eddc7d8d8bc9d59c8 # v3.3.1
with:
path: ${{ github.workspace }}/.ccache
key: ccache_all_windows_${{ github.sha }}
restore-keys: ccache_all_windows
# Fetch dependencies.
# TODO(scotttodd): Move some of these into a Docker image / add to PATH.
- name: "Updating git submodules"
run: git submodule update --init --jobs 8 --depth 1
- name: "Setting up Python"
@@ -114,30 +110,13 @@ jobs:
# Finally: build and run tests.
- name: "Building IREE"
env:
IREE_READ_REMOTE_CCACHE: 0
IREE_WRITE_REMOTE_CCACHE: 0
IREE_READ_LOCAL_CCACHE: 1
IREE_WRITE_LOCAL_CCACHE: ${{ needs.setup.outputs.write-caches }}
CCACHE_DIR: ${{ github.workspace }}/.ccache
# Cache size and compression level settings are a delicate balance.
# * A full build cache is around 2-5GB depending on compression level
# * Upload/download is slow (double compression may or may not help)
# * We have a limit of 10GB across all cached files per repository
# * Cache misses are quite costly:
# * 99% cache hits -> ~5 minutes to build
# * 20% cache hits -> ~15-20 minutes to build
CCACHE_MAXSIZE: 4G
CCACHE_COMPRESSLEVEL: 5
run: ./build_tools/cmake/build_all.sh "${BUILD_DIR}"
IREE_WRITE_REMOTE_CCACHE: ${{ needs.setup.outputs.write-caches }}
IREE_CCACHE_GCP_TOKEN: ${{ steps.gcp-auth.outputs.access_token }}
CCACHE_NAMESPACE: github-windows-2022-64core
run: |
./build_tools/cmake/build_all.sh "${BUILD_DIR}"
- name: "Testing IREE"
run: ./build_tools/cmake/ctest_all.sh "${BUILD_DIR}"
# Write cache (if configured to) after all other steps are finished.
- name: "Saving cache (CMake/ccache)"
if: needs.setup.outputs.write-caches == '1'
uses: actions/cache/save@88522ab9f39a2ea568f7027eddc7d8d8bc9d59c8 # v3.3.1
with:
path: ${{ github.workspace }}/.ccache
key: ccache_all_windows_${{ github.sha }}

build_test_all_macos_arm64:
needs: setup