[devdocs] Add dev doc describing gateways configuration
daniel-goldstein committed Apr 29, 2024
1 parent 2d41c6f commit c4c5177
Showing 2 changed files with 35 additions and 0 deletions.
35 changes: 35 additions & 0 deletions dev-docs/services/gateways.md
@@ -0,0 +1,35 @@
# Overview of the Batch Control Plane External and Internal Load Balancers

Traffic flows into the Kubernetes cluster through two points of ingress: `gateway`,
which receives traffic from the internet, and `internal-gateway`, which manages traffic
from batch workers to the services in Kubernetes.

These reverse proxies/load balancers handle traffic routing to the appropriate
namespace/service, manage TLS, perform additional authorization checks for non-prod
namespaces, and enforce rate limits.
Our reverse proxy of choice is [Envoy](https://www.envoyproxy.io/).

The general routing rules for the gateways are as follows (Kubernetes DNS provides addresses
for `Service`s in the form of `<service>.<namespace>.svc.cluster.local`):

### Gateway
- `<service>.hail.is/<path>` => `<service>.default.svc.cluster.local/<path>`
- `internal.hail.is/<dev-or-pr>/<service>/<path>` => `<service>.<dev-or-pr>.svc.cluster.local/<dev-or-pr>/<service>/<path>`[^1]

[^1]: At the time of writing, developers cannot sign in to PR namespaces through the
browser because those namespaces are not assigned a callback for GCP/Azure OAuth flows.


### Internal Gateway
- `<service>.hail/<path>` => `<service>.default.svc.cluster.local/<path>`
- `internal.hail/<dev-or-pr>/<service>/<path>` => `<service>.<dev-or-pr>.svc.cluster.local/<dev-or-pr>/<service>/<path>`
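
The routing rules for both gateways can be sketched as one small function. This is
only an illustration of the mapping, not the Envoy configuration itself: the function
name `route`, the `domain` parameter, and the example namespace `alice` are
hypothetical, and the sketch assumes the request path is forwarded to the upstream
unchanged.

```python
from typing import Optional

K8S_DNS_SUFFIX = "svc.cluster.local"


def route(host: str, path: str, domain: str = "hail.is") -> Optional[str]:
    """Return '<k8s-dns-name><path>' for a request, or None if no rule matches.

    Use domain="hail.is" for `gateway` and domain="hail" for `internal-gateway`.
    """
    if not host.endswith("." + domain):
        return None
    subdomain = host[: -len(domain) - 1]

    if subdomain == "internal":
        # internal.<domain>/<dev-or-pr>/<service>/<path>:
        # the namespace and service are the first two path segments.
        parts = path.lstrip("/").split("/", 2)
        if len(parts) < 2 or not parts[0] or not parts[1]:
            return None
        namespace, service = parts[0], parts[1]
        # Assumption: the path is passed through without stripping the prefix.
        return f"{service}.{namespace}.{K8S_DNS_SUFFIX}{path}"

    # <service>.<domain>/<path>: production services live in `default`.
    return f"{subdomain}.default.{K8S_DNS_SUFFIX}{path}"
```

For example, `route("internal.hail", "/alice/batch/api", domain="hail")` resolves to
the `batch` service in the hypothetical `alice` namespace, while
`route("batch.hail.is", "/api")` resolves to `batch` in `default`.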

For Envoy to properly pool connections to K8s services, it needs to know
which "clusters" (services) exist at any point in time. This list is static for
production services, but PR namespaces are ephemeral and are
created/destroyed by CI many times per day. In order to notify the gateways
of new namespaces/services, CI tracks which namespaces are active and periodically
updates a K8s `ConfigMap` with fresh Envoy configuration. The gateways, using the
[Envoy xDS API](https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/operations/dynamic_configuration#xds-configuration-api-overview),
can dynamically load this new configuration as it changes without dropping existing traffic.
You can see CI's current view of the cluster's namespaces/services at ci.hail.is/namespaces.
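
As a rough sketch of what "fresh Envoy configuration" could look like, the function
below renders a cluster discovery (CDS) document from a map of active namespaces to
their services. It is not CI's actual code: the function name `render_cds`, the
`<namespace>-<service>` cluster naming scheme, the port, the timeout, and the choice
of JSON output are all assumptions made for illustration. The example namespaces
`alice` and `pr-1234` are hypothetical.

```python
import json
from typing import Dict, List

CLUSTER_TYPE = "type.googleapis.com/envoy.config.cluster.v3.Cluster"


def render_cds(namespaces: Dict[str, List[str]], port: int = 443) -> str:
    """Render one Envoy cluster per (namespace, service) pair as a JSON string."""
    resources = []
    # Sort so the output is stable and only changes when the namespaces do.
    for namespace in sorted(namespaces):
        for service in sorted(namespaces[namespace]):
            name = f"{namespace}-{service}"  # hypothetical naming scheme
            address = f"{service}.{namespace}.svc.cluster.local"
            resources.append({
                "@type": CLUSTER_TYPE,
                "name": name,
                "type": "STRICT_DNS",
                "connect_timeout": "5s",  # assumed value
                "load_assignment": {
                    "cluster_name": name,
                    "endpoints": [{
                        "lb_endpoints": [{
                            "endpoint": {
                                "address": {
                                    "socket_address": {
                                        "address": address,
                                        "port_value": port,  # assumed port
                                    }
                                }
                            }
                        }]
                    }],
                },
            })
    return json.dumps({"resources": resources}, indent=2, sort_keys=True)
```

A caller would pass something like `{"alice": ["batch"], "pr-1234": ["auth", "batch"]}`
and store the resulting string as a value in the `ConfigMap` that the gateways read.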
