Sidecar Container Support #24834

Conversation

@tdipadova3rd commented Sep 27, 2024

Summary & Motivation

This PR adds support for sidecar containers for all Dagster pods on K8s versions 1.29 and later (Native Sidecars).

Google recommends using the Cloud SQL Auth Proxy to connect to Cloud SQL from Kubernetes. The proxy is deployed as a sidecar container, allowing application pods to connect to PostgreSQL on localhost while the proxy handles authentication and encryption over private IP. There is a discussion item open requesting support for the Cloud SQL Auth Proxy in self-hosted Dagster.

How I Tested These Changes

I tested these changes by deploying Dagster on two GKE clusters, one running K8s 1.29 and one running K8s 1.30. Each deployment used the Cloud SQL Auth Proxy to connect to GCP Cloud SQL for the Webserver, Daemon, and Migrate Job.

K8s 1.29+
For K8s 1.29 and later, you can deploy Dagster with the Cloud SQL Auth Proxy using Native Sidecars.

K8s 1.29+ Example Config
```yaml
runLauncher:
  config:
    k8sRunLauncher:
      runK8sConfig:
        podSpecConfig:
          initContainers:
            - name: cloud-sql-proxy
              restartPolicy: Always
              image: "gcr.io/cloud-sql-connectors/cloud-sql-proxy:2.1.0"
              args:
                - "--private-ip"
                - "--structured-logs"
                - "--port=5432"
                - "your-cloud-sql-instance"
dagsterWebserver:
  extraInitContainers:
    - name: cloud-sql-proxy
      restartPolicy: Always
      image: "gcr.io/cloud-sql-connectors/cloud-sql-proxy:2.1.0"
      args:
        - "--private-ip"
        - "--structured-logs"
        - "--port=5432"
        - "your-cloud-sql-instance"
dagsterDaemon:
  extraInitContainers:
    - name: cloud-sql-proxy
      restartPolicy: Always
      image: "gcr.io/cloud-sql-connectors/cloud-sql-proxy:2.1.0"
      args:
        - "--private-ip"
        - "--structured-logs"
        - "--port=5432"
        - "your-cloud-sql-instance"
migrate:
  extraInitContainers:
    - name: cloud-sql-proxy
      restartPolicy: Always
      image: "gcr.io/cloud-sql-connectors/cloud-sql-proxy:2.1.0"
      args:
        - "--private-ip"
        - "--structured-logs"
        - "--port=5432"
        - "your-cloud-sql-instance"

K8s before 1.29
This change only enables sidecar support for K8s clusters with Native Sidecar functionality. The Native Sidecar feature entered beta in K8s 1.29 and is now available on most hosted K8s providers.
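
For context on why 1.29 is the cutoff: a native sidecar is simply an init container with restartPolicy: Always, which Kubernetes starts before the main containers and keeps running for the pod's lifetime. A minimal illustrative pod spec (not part of this PR; the app container is a placeholder):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: native-sidecar-demo
spec:
  initContainers:
    - name: cloud-sql-proxy
      image: gcr.io/cloud-sql-connectors/cloud-sql-proxy:2.1.0
      restartPolicy: Always    # this field is what makes it a native sidecar (K8s 1.29+)
      args: ["--private-ip", "--structured-logs", "--port=5432", "your-cloud-sql-instance"]
  containers:
    - name: app                # placeholder for a Dagster webserver/daemon/run container
      image: busybox
      command: ["sh", "-c", "sleep 3600"]
```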

Changelog

Insert changelog entry or "NOCHANGELOG" here.

  • NEW (added new feature or capability)
  • BUGFIX (fixed a bug)
  • DOCS (added or updated documentation)

@mlarose (Contributor) left a comment

Hi @tdipadova3rd

Thank you very much for this contribution.

> For versions before 1.29, you can deploy Dagster with Cloud SQL Auth Proxy using a few custom commands. For these versions, you cannot use the K8sRunLauncher; instead you need to configure deployments to use the DefaultRunLauncher, which runs Dagster runs as new processes in the Code Location pod.

That's quite a severe limitation and, combined with the other workarounds (such as pgIsReadyCommandOverride and customMigrateCommand with arbitrary sleeps), I think the tradeoffs and potential land mines (occasionally hitting those timeouts, leading to hard-to-debug failures) introduced to support <1.29 may not be worthwhile. How would you feel about dropping that from this PR?

@tdipadova3rd (Author) commented Oct 9, 2024

@mlarose Thank you for the review! I've removed support for K8s versions <1.29 and I've renamed extraInitContainers to extraPrependedInitContainers where appropriate. I've deployed and tested these changes in my clusters. Please let me know if there are any other changes you'd like.

@garethbrickman requested a review from mlarose October 9, 2024 20:45
@mlarose requested a review from gibsondan October 10, 2024 19:15
@mlarose (Contributor) commented Oct 10, 2024

@gibsondan - this one looks good to me, but I'd appreciate it if you could take a glance here.

@mlarose (Contributor) commented Oct 10, 2024

> @mlarose Thank you for the review! I've removed support for K8s versions <1.29 and I've renamed extraInitContainers to extraPrependedInitContainers where appropriate. I've deployed and tested these changes in my clusters. Please let me know if there are any other changes you'd like.

@tdipadova3rd my understanding is you don't need extraContainers anymore to support native 1.29 sidecars, but are keeping it for uses other than supporting the Google proxy. I suppose it's OK to leave that feature there.

However, some comments and the PR description are a bit confusing.

Specifically: the 1.28 example in this description. I will also point out a few comments where disambiguating sidecars (the pattern vs. the Kubernetes feature) might benefit the user.
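
For reference, the disambiguation being asked for comes down to which values key a container is declared under. A rough sketch, using the key names from the example config above (the final chart may use slightly different names, e.g. extraPrependedInitContainers):

```yaml
# Sidecar *pattern* (any K8s version): just an ordinary extra container that
# runs alongside the main one.
extraContainers:
  - name: cloud-sql-proxy
    image: gcr.io/cloud-sql-connectors/cloud-sql-proxy:2.1.0

# *Native* sidecar (K8s 1.29+ feature): an init container with
# restartPolicy: Always, started before the main container and kept running
# for the pod's lifetime, with ordered startup and shutdown handled by the kubelet.
extraInitContainers:
  - name: cloud-sql-proxy
    image: gcr.io/cloud-sql-connectors/cloud-sql-proxy:2.1.0
    restartPolicy: Always
```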

# For K8s versions after 1.29, see:
# https://kubernetes.io/docs/concepts/workloads/pods/sidecar-containers/
#
# Extra init containers are started **before** the check-db-ready init container.
@mlarose (Contributor) commented:

Unless you are reading the Helm chart's implementation, which most users won't, check-db-ready is unknown to readers of this configuration file.

Perhaps I'd wordsmith this as "Extra init containers are started before the connection to the database is tested."

@@ -1280,6 +1326,26 @@ extraManifests:
migrate:
enabled: false

# Additional containers that should be run as sidecars to the migration job. See:
# https://kubernetes.io/docs/concepts/workloads/pods/#how-pods-manage-multiple-containers
# For K8s versions after 1.29, prefer using initContainers instead. See:
@mlarose (Contributor) commented:

Suggested change:
- # For K8s versions after 1.29, prefer using initContainers instead. See:
+ # For K8s versions after 1.29, prefer using initContainers to use native sidecars instead. See:

# image: busybox
extraContainers: []

# Init containers that should run before the main job container, such as sidecars. See:
@mlarose (Contributor) commented:

Suggested change:
- # Init containers that should run before the main job container, such as sidecars. See:
+ # Init containers that should run before the main job container, such as native sidecars. See:

# For K8s versions after 1.29, see:
# https://kubernetes.io/docs/concepts/workloads/pods/sidecar-containers/
#
# Extra init containers are started **before** the check-db-ready init container.
@mlarose (Contributor) commented:

see other comment

@mlarose assigned mlarose and tdipadova3rd and unassigned mlarose Oct 10, 2024
@gibsondan (Member) commented:

Hey @tdipadova3rd - there are a couple of formatting errors that our CI picked up as well: https://buildkite.com/dagster/unit-tests/builds/1055#019277e0-1939-46cc-9123-a1b97cd2bfc9 - once they're fixed, you should be able to confirm locally via "make ruff" so you don't have to wait for us to unblock it again.

@gibsondan (Member) left a comment

this might require a rebase now that a couple of months have passed. Still open to landing this once it passes CI!

hom3r pushed a commit to hom3r/dagster that referenced this pull request Jan 7, 2025
gibsondan pushed a commit that referenced this pull request Jan 11, 2025
## Summary & Motivation
This PR adds support for sidecar containers for all Dagster pods, for
versions of K8s after 1.29 ([Native
Sidecars](https://kubernetes.io/docs/concepts/workloads/pods/sidecar-containers/)).

Note: This is a copy of the PR #24834 from September 2024. The original
author @tdipadova3rd has submitted this feature but, unfortunately,
hasn't responded to some follow-up comments. The CI/CD pipeline also
failed due to code formatting issues. I reintegrated his changes into
the latest upstream version while fixing the code formatting issues.

Original summary: Google recommends using Cloud SQL Auth Proxy to
connect to Cloud SQL from Kubernetes. The proxy is deployed as a sidecar
container, allowing application pods to connect to PostgreSQL as
localhost while the proxy handles authentication and encryption over
private IP. There is a discussion item open (#9086) requesting support for
Cloud SQL Auth Proxy in self-hosted Dagster.

## How I Tested These Changes
I haven't tested the changes yet. I would like to check if the CI/CD
pipeline passes first, and then I'll try to test the feature on a
cluster.

## Changelog

- NEW (added new feature or capability)

Add support for sidecar containers for all Dagster pods, for versions of
K8s after 1.29 ([Native
Sidecars](https://kubernetes.io/docs/concepts/workloads/pods/sidecar-containers/)).

Resolves #24834 

Original author @tdipadova3rd.

---------

Co-authored-by: David Beran <david.beran@telekom.com>
marijncv pushed a commit to marijncv/dagster that referenced this pull request Jan 21, 2025
## Summary & Motivation
This PR adds support for sidecar containers for all Dagster pods, for
versions of K8s after 1.29 ([Native
Sidecars](https://kubernetes.io/docs/concepts/workloads/pods/sidecar-containers/)).

Note: This is a copy of the PR dagster-io#24834 from September 2024. The original
author @tdipadova3rd has submitted this feature but, unfortunately,
hasn't responded to some follow-up comments. The CI/CD pipeline also
failed due to code formatting issues. I reintegrated his changes into
the latest upstream version while fixing the code formatting issues.

Original summary: Google recommends using Cloud SQL Auth Proxy to
connect to Cloud SQL from Kubernetes. The proxy is deployed as a sidecar
container, allowing application pods to connect to PostgreSQL as
localhost while the proxy handles authentication and encryption over
private IP. There is a
dagster-io#9086 requesting
support for Cloud SQL Auth Proxy in self-hosted Dagster.

## How I Tested These Changes
I haven't tested the changes yet. I would like to check if the CI/CD
pipeline passes first, and then I'll try to test the feature on a
cluster.

## Changelog

- NEW (added new feature or capability)

Add support for sidecar containers for all Dagster pods, for versions of
K8s after 1.29 ([Native
Sidecars](https://kubernetes.io/docs/concepts/workloads/pods/sidecar-containers/)).

Resolves dagster-io#24834 

Original author @tdipadova3rd.

---------

Co-authored-by: David Beran <david.beran@telekom.com>