[devdocs] Add dev doc describing gateways configuration
1 parent 2d41c6f, commit a4746e2
Showing 2 changed files with 35 additions and 0 deletions.
# Overview of the Batch Control Plane External and Internal Load Balancers

Traffic flows into the Kubernetes cluster through one of two LoadBalancers: `gateway`,
which receives traffic from the internet, and `internal-gateway`, which manages traffic
from batch workers to the services in Kubernetes.

Both of these services receive all traffic destined for services in the cluster and act
as reverse proxies, routing those requests to the appropriate service, managing TLS,
sometimes performing authorization checks, and enforcing rate limits. Our reverse proxy
of choice is [Envoy](https://www.envoyproxy.io/).

The general routing rules for the gateways are as follows (Kubernetes DNS provides addresses
for `Service`s in the form of `<service>.<namespace>.svc.cluster.local`):

### Gateway
- `<service>.hail.is/<path> => <service>.default.svc.cluster.local/<path>`
- `internal.hail.is/<dev-or-pr>/<service>/<path> => <service>.<dev-or-pr>.svc.cluster.local/<dev-or-pr>/<service>/<path>`[^1]

[^1]: At the time of writing, developers cannot sign in to PR namespaces through the
browser because PR namespaces are not assigned a callback for GCP/Azure OAuth flows.

### Internal Gateway
- `<service>.hail/<path> => <service>.default.svc.cluster.local/<path>`
- `internal.hail/<dev-or-pr>/<service>/<path> => <service>.<dev-or-pr>.svc.cluster.local/<dev-or-pr>/<service>/<path>`
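
The routing rules for both gateways can be sketched as a small Python function. This is only an illustration of the mapping, not how the gateways are implemented (the real routing lives in Envoy configuration). The function name `upstream_for` and the error handling are hypothetical, and the sketch assumes the request path is forwarded unchanged, so the namespace stays as the first path segment.

```python
from typing import Tuple

# Domain suffixes handled by `gateway` (hail.is) and `internal-gateway` (hail).
GATEWAY_DOMAINS = ("hail.is", "hail")


def upstream_for(host: str, path: str) -> Tuple[str, str]:
    """Map a request (host, path) to (upstream_host, upstream_path)."""
    for domain in GATEWAY_DOMAINS:
        if host.endswith("." + domain):
            subdomain = host[: -len(domain) - 1]
            break
    else:
        raise ValueError(f"unrecognized host: {host}")

    if subdomain != "internal":
        # <service>.<domain>/<path> => <service>.default.svc.cluster.local/<path>
        return f"{subdomain}.default.svc.cluster.local", path

    # internal.<domain>/<dev-or-pr>/<service>/<path>:
    # the namespace and service are the first two path segments.
    parts = path.lstrip("/").split("/", 2)
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError(f"expected /<namespace>/<service>/..., got: {path}")
    namespace, service = parts[0], parts[1]
    # Assumption: the path is passed through as-is, prefix included.
    return f"{service}.{namespace}.svc.cluster.local", path
```

For example, `upstream_for("internal.hail", "/pr-123/batch/healthcheck")` resolves to the `batch` service in the (made-up) `pr-123` namespace, while `upstream_for("batch.hail.is", "/api")` resolves to `batch` in `default`.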

For Envoy to properly pool connections to K8s services, it needs to know
which "clusters" (services) exist at any point in time. This list is static for
production services, but test/PR namespaces are ephemeral and are
created/destroyed by CI many times per day. To notify the gateways
of new namespaces/services, CI tracks which namespaces are active and periodically
updates a K8s `ConfigMap` with fresh Envoy configuration. The gateways, using the
[Envoy xDS API](https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/operations/dynamic_configuration#xds-configuration-api-overview),
can dynamically load this new configuration as it changes without dropping existing traffic.
You can see CI's current view of the cluster's namespaces/services at ci.hail.is/namespaces.
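
To make the idea concrete, here is a rough sketch of how a list of active namespaces could be turned into an Envoy cluster-discovery (CDS) document suitable for storing in a `ConfigMap`. The resource layout follows Envoy's v3 `Cluster` message as used in its filesystem-based dynamic configuration. Everything else is an assumption for illustration: the function names, the `<namespace>-<service>` cluster naming, and port 443 are hypothetical, and CI's actual output may look different.

```python
import json
from typing import Dict, List

CLUSTER_TYPE = "type.googleapis.com/envoy.config.cluster.v3.Cluster"


def cluster(namespace: str, service: str, port: int = 443) -> dict:
    """Build one Envoy cluster pointing at a K8s Service's DNS name."""
    name = f"{namespace}-{service}"  # hypothetical naming scheme
    address = f"{service}.{namespace}.svc.cluster.local"
    return {
        "@type": CLUSTER_TYPE,
        "name": name,
        "type": "STRICT_DNS",
        "load_assignment": {
            "cluster_name": name,
            "endpoints": [{
                "lb_endpoints": [{
                    "endpoint": {
                        "address": {
                            "socket_address": {
                                "address": address,
                                "port_value": port,  # assumed port
                            }
                        }
                    }
                }]
            }],
        },
    }


def cds_config(active: Dict[str, List[str]]) -> str:
    """Render a CDS document for a mapping of namespace -> services."""
    # Sort so the output only changes when the set of services changes.
    resources = [
        cluster(ns, svc)
        for ns in sorted(active)
        for svc in sorted(active[ns])
    ]
    return json.dumps({"resources": resources}, indent=2)
```

Emitting the clusters in sorted order keeps the rendered document stable between updates, so a periodic job would only need to rewrite the `ConfigMap` when a namespace or service actually appears or disappears.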