Skip to content

Commit

Permalink
[Custom DC] Various setup improvements (#2349)
Browse files Browse the repository at this point in the history
Cluster
- Migrate the cluster management from shell script to Terraform module.
This allows Terraform to also delete the GKE cluster, 1 step closer for
E2E test
- Allow zonal cluster instead of enforcing regional clusters, which
should cut the resource by 2/3. Location specifies whether a zonal or a
regional cluster is created. ex: "us-cental1" -> regional,
"us-central1-a" -> zonal. The difference is 99.95% SLA vs 99.5% SLA. I
think most POCs/dev projects makes sense to be zonal.

Helm chart
- Allow gcr repo project to be specified, this is a follow up for
@shifucun's support for building images in custom dc project
- Move helm out of Terraform management. In the past few weeks, issue 1
is that it was easy to get into a circular situation where the state
needs to be read at the beginning (reading what resource a cluster has
requires kube config), but the kube config is yet to be fetched because
it happens in the part after reading the states. May be cleaner to
separate GCP resource level and k8s resource levels.
- This does not affect GCP level resource lifecycle, as even if pods and
services are in a cluster, the cluster can still be deleted.
- Taking the k8s resources out of the Terraform lifecycle may make
things easier when integrating with Argo, as Argo changes will not touch
Terraform at all.

---------

Co-authored-by: Alex Chen <alexyfchen@google.com>
  • Loading branch information
Fructokinase and Alex Chen authored Mar 2, 2023
1 parent 3c50bc5 commit ddc1dfd
Show file tree
Hide file tree
Showing 14 changed files with 127 additions and 195 deletions.
12 changes: 6 additions & 6 deletions deploy/helm_charts/dc_website/templates/deployment.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -56,7 +56,7 @@ spec:
name: schema-mapping{{ .Values.resourceSuffix }}
containers:
- name: website
image: "{{ .Values.website.image.repository }}:{{ .Values.website.image.tag }}"
image: "gcr.io/{{ .Values.website.image.project }}/datacommons-website:{{ .Values.website.image.tag }}"
imagePullPolicy: {{ .Values.website.image.pullPolicy }}
args: []
ports:
Expand All @@ -75,9 +75,9 @@ spec:
periodSeconds: 10
resources:
limits:
memory: "8G"
memory: "3G"
requests:
memory: "8G"
memory: "3G"
volumeMounts:
- name: ai-config
mountPath: /datacommons/ai
Expand Down Expand Up @@ -117,7 +117,7 @@ spec:
- name: BOUNCE
value: "dummy"
- name: mixer
image: "{{ .Values.mixer.image.repository }}:{{ .Values.mixer.image.tag | default .Chart.AppVersion }}"
image: "gcr.io/{{ .Values.mixer.image.project }}/datacommons-mixer:{{ .Values.mixer.image.tag }}"
imagePullPolicy: {{ .Values.mixer.image.pullPolicy }}
resources:
limits:
Expand Down Expand Up @@ -201,9 +201,9 @@ spec:
key: serviceName
resources:
limits:
memory: "2G"
memory: "1G"
requests:
memory: "2G"
memory: "1G"
readinessProbe:
httpGet:
path: /healthz
Expand Down
5 changes: 2 additions & 3 deletions deploy/helm_charts/dc_website/templates/ingress.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -29,10 +29,9 @@ kind: Ingress
metadata:
name: {{ .Values.ingress.name }}
namespace: {{ .Values.namespace.name }}
{{- with .Values.ingress.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
kubernetes.io/ingress.global-static-ip-name: dc-website-ip
ingress.gcp.kubernetes.io/pre-shared-cert: dc-website-cert{{ .Values.resourceSuffix }}
spec:
rules:
- http:
Expand Down
9 changes: 4 additions & 5 deletions deploy/helm_charts/dc_website/values.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

resourceSuffix:
resourceSuffix: ""

# Website service config.
website:
Expand All @@ -13,11 +13,11 @@ website:
githash:

image:
repository: gcr.io/datcom-ci/datacommons-website
project: "datcom-ci"
pullPolicy: Always
tag:

flaskEnv:
flaskEnv: "custom"
secretGCPProjectID:
enableModel: false

Expand Down Expand Up @@ -45,7 +45,6 @@ serviceAccount:
ingress:
name: website-ingress
enabled:
annotations: { kubernetes.io/ingress.global-static-ip-name: mixer-ip }

###############################################################################
# Config for Mixer helm chart
Expand All @@ -55,7 +54,7 @@ mixer:
githash:

image:
repository: gcr.io/datcom-ci/datacommons-mixer
project: "datcom-ci"
pullPolicy: Always
tag:

Expand Down

This file was deleted.

49 changes: 2 additions & 47 deletions deploy/terraform-datacommons-website/examples/website_v1/main.tf
Original file line number Diff line number Diff line change
Expand Up @@ -64,7 +64,7 @@ module "apikeys" {
source = "../../modules/apikeys"
project_id = var.project_id
dc_website_domain = var.dc_website_domain
location = var.region
location = var.location

resource_suffix = local.resource_suffix
}
Expand All @@ -78,7 +78,7 @@ module "esp" {
module "cluster" {
source = "../../modules/gke"
project_id = var.project_id
region = var.region
location = var.location
cluster_name_prefix = var.cluster_name_prefix
web_robot_sa_email = local.web_robot_sa_email

Expand All @@ -99,48 +99,3 @@ resource "google_compute_managed_ssl_certificate" "dc_website_cert" {
domains = [format("%s.", var.dc_website_domain)]
}
}

# IMPORTANT NOTE: This script assumes that
# "~/.kube/config" already exists. This is because provider cannot depend on data or resources,
# as provider blocks need to be determined before resources/data states are fetched.
# In install_custom_dc.sh, currentlythe kubeconfig is fetched before calling terraform apply.
# .kube/config is the location where gcloud command for GKE stores cluster config, which
# is required to access the cluster, including using helm.
provider "kubernetes" {
alias = "datcom"
kubernetes {
config_path = "~/.kube/config"
}
}

provider "helm" {
alias = "datcom"
kubernetes {
config_path = "~/.kube/config"
}
}

module "k8s_resources" {
providers = {
kubernetes = kubernetes.datcom
helm = helm.datcom
}

resource_suffix = local.resource_suffix
website_githash = var.website_githash
mixer_githash = var.mixer_githash

source = "../../modules/helm"
project_id = var.project_id

cluster_name = module.cluster.name
cluster_region = var.region
dc_website_domain = var.dc_website_domain
global_static_ip_name = format("dc-website-ip%s", local.resource_suffix)
managed_cert_name = google_compute_managed_ssl_certificate.dc_website_cert.name

depends_on = [
google_compute_managed_ssl_certificate.dc_website_cert,
module.cluster
]
}
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,6 @@ output "cluster_name" {
value = module.cluster.name
}

output "cluster_region" {
value = module.cluster.region
output "cluster_location" {
value = module.cluster.location
}
Original file line number Diff line number Diff line change
Expand Up @@ -45,10 +45,21 @@ variable "cluster_name_prefix" {
default = "datacommons"
}

variable "region" {
variable "location" {
type = string
description = "GCP region where the cluster will be created in."
default = "us-central1"
description = <<EOF
Location of the GCP resources to be created.
Can be regional, like "us-central1". Or zonal like "us-central1-a"
Major difference between regional and zonal is that for GKE cluster, regional
clusters will have nodes in each zone of that region, giving higher availability,
but is more expensive.
For regional only resources, if zonal location is specified, the region
will be parsed from the zone.
EOF
default = "us-central1-a"
}

variable "global_static_ip_name" {
Expand Down
3 changes: 2 additions & 1 deletion deploy/terraform-datacommons-website/modules/apikeys/main.tf
Original file line number Diff line number Diff line change
Expand Up @@ -45,7 +45,8 @@ resource "google_secret_manager_secret" "maps_api_key_secret" {
replication {
user_managed {
replicas {
location = var.location
# location needs to be a region here, so if var.location is a zone, make it a region.
location = length(split(var.location, "-")) == 2 ? var.location : join("-", slice(split( "-", var.location), 0, 2))
}
}
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -12,8 +12,14 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# Region is like "us-central1", zone is like "us-central1-a"
if [[ $LOCATION =~ ^[a-z]+-[a-z0-9]+$ ]]; then
REGION=$LOCATION
else
ZONE=$LOCATION
fi
gcloud container clusters get-credentials $CLUSTER_NAME \
--region $REGION --project=$PROJECT_ID
${REGION:+--region $REGION} ${ZONE:+--zone $ZONE} --project=$PROJECT_ID

# Create namespace if it does not exist.
kubectl create namespace website \
Expand Down
24 changes: 0 additions & 24 deletions deploy/terraform-datacommons-website/modules/gke/create_cluster.sh

This file was deleted.

Loading

0 comments on commit ddc1dfd

Please sign in to comment.