Migrate standalone deployment to workload identity on GCP (#2619)
* Script to set up workload identity for standalone deployment

* Migrate tests to run on standalone + workload identity

* Fix test script

* Switch to static GSAs for testing, because they have name length limit

* Add workload identity binding for argo

* Fix argo workload identity bindings

* Remove user-gcp-sa from tests

* Remove use_gcp_secret from xgboost sample

* Allow debugging tests locally

* Wait for policies to take effect

* Update deploy-pipeline-lite.sh

* Update deploy-pipeline-lite.sh

* [WIP] test gcloud auth list with test-runner sa

* Add namespace

* test again

* Use new image builder

* test again

* Remove debug code

* Remove usages of use_gcp_secret

* Fix unit test and tensorboard pod template

* Add debug code again to test

* Try waiting until workload identity bindings are ready

* Fix some other samples

* Fix parameterized tfx oss sample

* Add retry to image building

* Try fixing tfx oss sample

* Fix compiled tfx oss sample

* Update all google/cloud-sdk to latest

* Try fixing parameterized tfx oss sample again

* Also verify pipeline-runner ksa is working

* Fix parameterized_tfx_oss sample

* Update gcp-workload-identity-setup.sh

* Revert unneeded change

* Pin to new google/cloud-sdk

* Remove wrongly committed binaries
Bobgy authored and k8s-ci-robot committed Dec 17, 2019
1 parent 3d008f9 commit 4a8d262
Showing 32 changed files with 3,390 additions and 193 deletions.
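The commit message notes that the tests switched to static GSAs because GSA names have a length limit. The default names in the setup script are derived from `$CLUSTER_NAME`, so a long cluster name could hit the same limit. Below is a minimal sketch of a pre-flight check, assuming the IAM rule that a service account ID is 6 to 30 characters of lowercase letters, digits, and dashes, starting with a letter. The function name `is_valid_gsa_name` and the sample cluster name are hypothetical and not part of this commit.

```shell
#!/bin/bash
# Hypothetical helper, not part of the commit: checks a candidate GSA name
# against the assumed IAM rule (6-30 chars, lowercase letters, digits, dashes,
# starting with a letter and not ending with a dash).
function is_valid_gsa_name {
  local name=${1}
  local pattern='^[a-z][a-z0-9-]{4,28}[a-z0-9]$'
  [[ "$name" =~ $pattern ]]
}

CLUSTER_NAME=${CLUSTER_NAME:-my-cluster} # placeholder value for illustration
for gsa in "$CLUSTER_NAME-kfp-system" "$CLUSTER_NAME-kfp-user"; do
  if is_valid_gsa_name "$gsa"; then
    echo "OK: $gsa (${#gsa} chars)"
  else
    echo "Invalid GSA name: $gsa (${#gsa} chars)"
  fi
done
```

Running a check like this before `gcloud iam service-accounts create` would surface an overlong name early, instead of partway through the setup.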
28 changes: 1 addition & 27 deletions frontend/server/k8s-helper.ts
@@ -44,33 +44,7 @@ const workflowPlural = 'workflows';
 /** Default pod template spec used to create tensorboard viewer. */
 export const defaultPodTemplateSpec = {
   spec: {
-    containers: [
-      {
-        env: [
-          {
-            name: 'GOOGLE_APPLICATION_CREDENTIALS',
-            value: '/secret/gcp-credentials/user-gcp-sa.json',
-          },
-        ],
-        volumeMounts: [
-          {
-            name: 'gcp-credentials',
-            mountPath: '/secret/gcp-credentials/user-gcp-sa.json',
-            readOnly: true,
-          },
-        ],
-      },
-    ],
-    volumes: [
-      {
-        name: 'gcp-credentials',
-        volumeSource: {
-          secret: {
-            secretName: 'user-gcp-sa',
-          },
-        },
-      },
-    ],
+    containers: [{}],
   },
 };

131 changes: 131 additions & 0 deletions manifests/kustomize/gcp-workload-identity-setup.sh
@@ -0,0 +1,131 @@
#!/bin/bash
#
# Copyright 2019 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

set -e

# Google Service Account (GSA)
SYSTEM_GSA=${SYSTEM_GSA:-$CLUSTER_NAME-kfp-system}
USER_GSA=${USER_GSA:-$CLUSTER_NAME-kfp-user}

# Kubernetes Service Account (KSA)
SYSTEM_KSA=(ml-pipeline-ui)
USER_KSA=(pipeline-runner default) # default service account is used for container building, TODO: give it a specific name

cat <<EOF
We recommend first reviewing the introduction to workload identity: https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity.
This script sets up Google service accounts and workload identity bindings for a Kubeflow Pipelines (KFP) standalone deployment.
You can also set these up manually by following the documentation: https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity.
Before you begin, please check the following list:
* gcloud is configured following the steps at: https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity#before_you_begin.
* KFP is already deployed using the standalone deployment: https://www.kubeflow.org/docs/pipelines/standalone-deployment-gcp/.
* kubectl talks to the cluster that KFP is deployed to: https://cloud.google.com/kubernetes-engine/docs/how-to/cluster-access-for-kubectl.
The following resources will be created to bind workload identity between GSAs and KSAs:
* Google service accounts (GSAs): $SYSTEM_GSA and $USER_GSA.
* Service account IAM policy bindings.
* Kubernetes service account annotations.
EOF

NAMESPACE=${NAMESPACE:-kubeflow}
function usage {
cat <<\EOF
Usage:
```
PROJECT_ID=<your-gcp-project-id> CLUSTER_NAME=<your-gke-cluster-name> NAMESPACE=<your-k8s-namespace> ./gcp-workload-identity-setup.sh
```
PROJECT_ID: GCP project ID your cluster belongs to.
CLUSTER_NAME: your GKE cluster's name.
NAMESPACE: Kubernetes namespace your Kubeflow Pipelines standalone deployment belongs to (default is kubeflow).
EOF
}
if [ -z "$PROJECT_ID" ]; then
  usage
  echo
  echo "Error: PROJECT_ID env variable is empty!"
  exit 1
fi
if [ -z "$CLUSTER_NAME" ]; then
  usage
  echo
  echo "Error: CLUSTER_NAME env variable is empty!"
  exit 1
fi
echo "Env variables set:"
echo "* PROJECT_ID=$PROJECT_ID"
echo "* CLUSTER_NAME=$CLUSTER_NAME"
echo "* NAMESPACE=$NAMESPACE"
echo

# Only an explicit "y" or "Y" continues; any other reply (including Enter) exits.
read -p "Continue? (y/N) " -n 1 -r
echo
if [[ ! $REPLY =~ ^[Yy]$ ]]; then
  exit 0
fi

echo "Creating Google service accounts..."
function create_gsa_if_not_present {
  local name=${1}
  # Declare and assign separately so that `set -e` notices a gcloud failure.
  local already_present
  already_present=$(gcloud iam service-accounts list --filter="name:$name" --format='value(name)')
  if [ -n "$already_present" ]; then
    echo "Service account $name already exists"
  else
    gcloud iam service-accounts create "$name"
  fi
}
create_gsa_if_not_present "$SYSTEM_GSA"
create_gsa_if_not_present "$USER_GSA"

# You can optionally choose to add iam policy bindings to grant project permissions to these GSAs.
# You can also set these up later.
# gcloud projects add-iam-policy-binding $PROJECT_ID \
# --member="serviceAccount:$SYSTEM_GSA@$PROJECT_ID.iam.gserviceaccount.com" \
# --role="roles/editor"
# gcloud projects add-iam-policy-binding $PROJECT_ID \
# --member="serviceAccount:$USER_GSA@$PROJECT_ID.iam.gserviceaccount.com" \
# --role="roles/editor"

# Bind KSA to GSA through workload identity.
# Documentation: https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity
function bind_gsa_and_ksa {
  local gsa=${1}
  local ksa=${2}

  # Allow the KSA to impersonate the GSA.
  gcloud iam service-accounts add-iam-policy-binding "$gsa@$PROJECT_ID.iam.gserviceaccount.com" \
    --member="serviceAccount:$PROJECT_ID.svc.id.goog[$NAMESPACE/$ksa]" \
    --role="roles/iam.workloadIdentityUser" \
    > /dev/null # hide verbose output
  # Tell GKE which GSA the KSA maps to.
  kubectl annotate serviceaccount \
    --namespace "$NAMESPACE" \
    --overwrite \
    "$ksa" \
    "iam.gke.io/gcp-service-account=$gsa@$PROJECT_ID.iam.gserviceaccount.com"
  echo "* Bound KSA $ksa to GSA $gsa"
}

echo "Binding each kfp system KSA to $SYSTEM_GSA"
for ksa in "${SYSTEM_KSA[@]}"; do
  bind_gsa_and_ksa "$SYSTEM_GSA" "$ksa"
done

echo "Binding each kfp user KSA to $USER_GSA"
for ksa in "${USER_KSA[@]}"; do
  bind_gsa_and_ksa "$USER_GSA" "$ksa"
done
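Each binding in the script above pairs one IAM member string with one GSA email. The sketch below only formats and prints those two identifiers so they can be reviewed before running the real script. It does not call gcloud or kubectl. The function names `gsa_email` and `wi_member`, and the sample project and GSA names, are hypothetical and not part of this commit.

```shell
#!/bin/bash
# Hypothetical helpers for illustration; they only format strings.

# Email address of a Google service account in a project.
function gsa_email {
  local gsa=${1}
  local project=${2}
  echo "$gsa@$project.iam.gserviceaccount.com"
}

# IAM member string that represents a KSA under workload identity.
function wi_member {
  local project=${1}
  local namespace=${2}
  local ksa=${3}
  echo "serviceAccount:$project.svc.id.goog[$namespace/$ksa]"
}

PROJECT_ID=${PROJECT_ID:-my-project} # placeholder value
NAMESPACE=${NAMESPACE:-kubeflow}
USER_GSA=${USER_GSA:-my-cluster-kfp-user} # placeholder value
for ksa in pipeline-runner default; do
  echo "$(wi_member "$PROJECT_ID" "$NAMESPACE" "$ksa") -> $(gsa_email "$USER_GSA" "$PROJECT_ID")"
done
```

Printing the pairs first makes it easier to spot a wrong namespace or project before any IAM policy is changed.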
85 changes: 85 additions & 0 deletions manifests/kustomize/wi-utils.sh
@@ -0,0 +1,85 @@
#!/bin/bash
#
# Copyright 2019 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

function create_gsa_if_not_present {
  local name=${1}
  # Declare and assign separately so that a gcloud failure is not masked by `local`.
  local already_present
  already_present=$(gcloud iam service-accounts list --filter="name:$name" --format='value(name)')
  if [ -n "$already_present" ]; then
    echo "Service account $name already exists"
  else
    gcloud iam service-accounts create "$name"
  fi
}

# Bind KSA to GSA through workload identity.
# Documentation: https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity
function bind_gsa_and_ksa {
  local gsa=${1}
  local ksa=${2}
  local project=${3:-$PROJECT_ID}
  local gsa_full="$gsa@$project.iam.gserviceaccount.com"
  local namespace=${4:-$NAMESPACE}

  # Allow the KSA to impersonate the GSA.
  gcloud iam service-accounts add-iam-policy-binding "$gsa_full" \
    --member="serviceAccount:$project.svc.id.goog[$namespace/$ksa]" \
    --role="roles/iam.workloadIdentityUser" \
    > /dev/null # hide verbose output
  # Tell GKE which GSA the KSA maps to.
  kubectl annotate serviceaccount \
    --namespace "$namespace" \
    --overwrite \
    "$ksa" \
    "iam.gke.io/gcp-service-account=$gsa_full"
  echo "* Bound KSA $ksa in namespace $namespace to GSA $gsa_full"
}

# Programmatically verifies that a workload identity binding gives the KSA
# the permissions of its corresponding GSA.
# Usage: verify_workload_identity_binding $KSA $NAMESPACE
#
# If you want to verify manually, use the following command instead:
#   kubectl run test-$RANDOM --rm -it --restart=Never \
#     --image=google/cloud-sdk:slim \
#     --serviceaccount $ksa \
#     --namespace $namespace \
#     -- /bin/bash
# This connects you to a pod that uses the specified KSA and runs an image with the gcloud and gsutil CLI tools.
function verify_workload_identity_binding {
  local ksa=${1}
  local namespace=${2}
  local max_attempts=10
  local workload_identity_is_ready=false
  for i in $(seq 1 "${max_attempts}"); do
    # Assume success, then mark as not ready if either check fails.
    workload_identity_is_ready=true
    kubectl run "test-$RANDOM" --rm -i --restart=Never \
      --image=google/cloud-sdk:slim \
      --serviceaccount "$ksa" \
      --namespace "$namespace" \
      -- gcloud auth list || workload_identity_is_ready=false
    kubectl run "test-$RANDOM" --rm -i --restart=Never \
      --image=google/cloud-sdk:slim \
      --serviceaccount "$ksa" \
      --namespace "$namespace" \
      -- gsutil ls gs:// || workload_identity_is_ready=false
    if [ "$workload_identity_is_ready" = true ]; then
      break
    fi
  done
  if [ "$workload_identity_is_ready" != true ]; then
    echo "Workload identity bindings are not ready after $max_attempts attempts"
    return 1
  fi
}
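The verification loop above retries without pausing between attempts. Since the commit message mentions waiting for policies to take effect, a delay between attempts may make the check more forgiving of slow propagation. Below is a minimal sketch of that idea, assuming a hypothetical `retry_with_delay` helper that is not part of this commit.

```shell
#!/bin/bash
# Hypothetical helper, not part of the commit: runs a command up to
# max_attempts times and sleeps delay_seconds after each failed attempt
# except the last one.
# Usage: retry_with_delay <max_attempts> <delay_seconds> <command> [args...]
function retry_with_delay {
  local max_attempts=${1}
  local delay_seconds=${2}
  shift 2
  local attempt
  for attempt in $(seq 1 "$max_attempts"); do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -lt "$max_attempts" ]; then
      echo "Attempt $attempt/$max_attempts failed, retrying in ${delay_seconds}s" >&2
      sleep "$delay_seconds"
    fi
  done
  return 1
}
```

For example, the two `kubectl run` checks could be wrapped in a function of your own and invoked as `retry_with_delay 10 30 <your-check-function> "$ksa" "$namespace"`. The attempt count and delay are illustrative values, not tuned recommendations.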