Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for multi-arch Docker images #1362

Merged
merged 2 commits into from
Apr 16, 2024

Conversation

shreyas-goenka
Copy link
Contributor

@shreyas-goenka shreyas-goenka commented Apr 15, 2024

Changes

This PR follows instructions in https://goreleaser.com/cookbooks/multi-platform-docker-images/ to create a multi-arch docker CLI image. Thus customers can simply specify docker pull ghcr.io/databricks/cli:latest to pull and run the image.

The current approach uses the docker manifest support in goreleaser to create a multi-arch image. This has a couple of pros and cons. TLDR; The changes as is in the PR are good to go and very low risk. The information provided here is just FYI.

pros:
Fewer configurations/workflows for us to manage/maintain. Goreleaser makes sure the correct CLI binary is in place when building the CLI and also takes care of publishing it to the Github Container Registry.

cons:
Goreleaser only supports docker manifest to create multi-arch images. This has a few minor disadvantages:

  1. goreleaser pushes all intermediate images (arm64 and and64 specific images) to the registry. This is required for the manifest to reference them. See: [Bug]: docker_manifest stage can't reference unpublished images from dockers stage goreleaser/goreleaser#2606

Note: We have a migration path here, if someday we stop publishing intermediate images, we can simply tag the "multi-arch" image as both latest-amd64 and latest-arm64. For now, these are separate images. see: https://github.com/databricks/cli/pkgs/container/cli

  1. docker manifest is technically an experimental command. Though it's been out for multiple years now and the indirect dependency by goreleaser should be fine. In any case, we can migrate by moving our docker build process off goreleaser if we need to.

Tests

Tested manually by publishing a new release for v0.0.0-docker in ghcr.io.

  1. Package: https://github.com/databricks/cli/pkgs/container/cli
  2. Release workflow: https://github.com/databricks/cli/actions/runs/8689359851

Tests the image itself by running it manually:

➜  cli git:(feature/multi-arch-docker) docker pull ghcr.io/databricks/cli:latest
latest: Pulling from databricks/cli
bca4290a9639: Already exists 
6d445556910d: Already exists 
Digest: sha256:82eabc500b541a89182aed4d3158c955e57c1e84d9616b76510aceb1d9024425
Status: Downloaded newer image for ghcr.io/databricks/cli:latest
ghcr.io/databricks/cli:latest

What's Next?
  View a summary of image vulnerabilities and recommendations → docker scout quickview ghcr.io/databricks/cli:latest
➜  cli git:(feature/multi-arch-docker) docker run ghcr.io/databricks/cli --version
Databricks CLI v0.0.0-docker

@pietern pietern changed the title Add support for multi-arch docker images Add support for multi-arch Docker images Apr 16, 2024
Copy link
Contributor

@pietern pietern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need the arch-specific tags for the intermediate images?

It would be nice if this is fully transparent and those don't even show up in the package list.

@shreyas-goenka
Copy link
Contributor Author

shreyas-goenka commented Apr 16, 2024

Do we still need the arch-specific tags for the intermediate images?

We do, unfortunately. It's a limitation of using goreleaser here to build the docker image:

goreleaser/goreleaser#2606
https://arc.net/l/quote/dgbbnzpi

@shreyas-goenka shreyas-goenka added this pull request to the merge queue Apr 16, 2024
Merged via the queue into main with commit 569bb1c Apr 16, 2024
5 checks passed
@shreyas-goenka shreyas-goenka deleted the feature/multi-arch-docker branch April 16, 2024 11:32
pietern added a commit that referenced this pull request Apr 23, 2024
This release marks the general availability of Databricks Asset Bundles.

CLI:
 * Publish Docker images ([#1353](#1353)).
 * Add support for multi-arch Docker images ([#1362](#1362)).
 * Do not prefill https:// in prompt for Databricks Host ([#1364](#1364)).
 * Add better documentation for the `auth login` command ([#1366](#1366)).
 * Add URLs for authentication documentation to the auth command help ([#1365](#1365)).

Bundles:
 * Fix compute override for foreach tasks ([#1357](#1357)).
 * Transform artifact files source patterns in build not upload stage ([#1359](#1359)).
 * Convert between integer and float in normalization ([#1371](#1371)).
 * Disable locking for development mode ([#1302](#1302)).
 * Resolve variable references inside variable lookup fields ([#1368](#1368)).
 * Added validate mutator to surface additional bundle warnings ([#1352](#1352)).
 * Upgrade terraform-provider-databricks to 1.40.0 ([#1376](#1376)).
 * Print host in `bundle validate` when passed via profile or environment variables ([#1378](#1378)).
 * Cleanup remote file path on bundle destroy ([#1374](#1374)).
 * Add docs URL for `run_as` in error message ([#1381](#1381)).
 * Enable job queueing by default ([#1385](#1385)).
 * Added support for job environments ([#1379](#1379)).
 * Processing and completion of positional args to bundle run ([#1120](#1120)).
 * Add legacy option for `run_as` ([#1384](#1384)).

API Changes:
 * Changed `databricks lakehouse-monitors cancel-refresh` command with new required argument order.
 * Changed `databricks lakehouse-monitors create` command with new required argument order.
 * Changed `databricks lakehouse-monitors delete` command with new required argument order.
 * Changed `databricks lakehouse-monitors get` command with new required argument order.
 * Changed `databricks lakehouse-monitors get-refresh` command with new required argument order.
 * Changed `databricks lakehouse-monitors list-refreshes` command with new required argument order.
 * Changed `databricks lakehouse-monitors run-refresh` command with new required argument order.
 * Changed `databricks lakehouse-monitors update` command with new required argument order.
 * Changed `databricks account workspace-assignment update` command to return response.

OpenAPI commit 94684175b8bd65f8701f89729351f8069e8309c9 (2024-04-11)

Dependency updates:
 * Bump github.com/databricks/databricks-sdk-go from 0.37.0 to 0.38.0 ([#1361](#1361)).
 * Bump golang.org/x/net from 0.22.0 to 0.23.0 ([#1380](#1380)).
@pietern pietern mentioned this pull request Apr 23, 2024
github-merge-queue bot pushed a commit that referenced this pull request Apr 23, 2024
This release marks the general availability of Databricks Asset Bundles.

CLI:
* Publish Docker images
([#1353](#1353)).
* Add support for multi-arch Docker images
([#1362](#1362)).
* Do not prefill https:// in prompt for Databricks Host
([#1364](#1364)).
* Add better documentation for the `auth login` command
([#1366](#1366)).
* Add URLs for authentication documentation to the auth command help
([#1365](#1365)).

Bundles:
* Fix compute override for foreach tasks
([#1357](#1357)).
* Transform artifact files source patterns in build not upload stage
([#1359](#1359)).
* Convert between integer and float in normalization
([#1371](#1371)).
* Disable locking for development mode
([#1302](#1302)).
* Resolve variable references inside variable lookup fields
([#1368](#1368)).
* Added validate mutator to surface additional bundle warnings
([#1352](#1352)).
* Upgrade terraform-provider-databricks to 1.40.0
([#1376](#1376)).
* Print host in `bundle validate` when passed via profile or environment
variables ([#1378](#1378)).
* Cleanup remote file path on bundle destroy
([#1374](#1374)).
* Add docs URL for `run_as` in error message
([#1381](#1381)).
* Enable job queueing by default
([#1385](#1385)).
* Added support for job environments
([#1379](#1379)).
* Processing and completion of positional args to bundle run
([#1120](#1120)).
* Add legacy option for `run_as`
([#1384](#1384)).

API Changes:
* Changed `databricks lakehouse-monitors cancel-refresh` command with
new required argument order.
* Changed `databricks lakehouse-monitors create` command with new
required argument order.
* Changed `databricks lakehouse-monitors delete` command with new
required argument order.
* Changed `databricks lakehouse-monitors get` command with new required
argument order.
* Changed `databricks lakehouse-monitors get-refresh` command with new
required argument order.
* Changed `databricks lakehouse-monitors list-refreshes` command with
new required argument order.
* Changed `databricks lakehouse-monitors run-refresh` command with new
required argument order.
* Changed `databricks lakehouse-monitors update` command with new
required argument order.
* Changed `databricks account workspace-assignment update` command to
return response.

OpenAPI commit 94684175b8bd65f8701f89729351f8069e8309c9 (2024-04-11)

Dependency updates:
* Bump github.com/databricks/databricks-sdk-go from 0.37.0 to 0.38.0
([#1361](#1361)).
* Bump golang.org/x/net from 0.22.0 to 0.23.0
([#1380](#1380)).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants