matthewfaw/mixnmatch-infrastructure

Mix&Match Infrastructure

This project sets up the infrastructure needed to build, run, monitor, analyze, and reproduce experiments.

This is the code used to create the infrastructure to run experiments associated with our Mix&Match paper:

  • The associated code to run experiments can be found in the other folder

Please cite the above paper if this code is used in any publication.

How-To

More specifically, this project sets up a Kubernetes cluster on Google Cloud that uses:

  • Jenkins to build the experiment code
  • Kubeflow to easily tune hyperparameters in parallel using Katib
  • The Kubernetes Dashboard to interact with the cluster in an easy way
  • Grafana/Prometheus to provide fine-grained memory usage monitoring

The project assumes that the following environment variables are set:

  • JENKINS_ADMIN_PW: The base64 encoded password you'd like to use for Jenkins (user is admin)
  • GRAFANA_ADMIN_PW: The base64 encoded password you'd like to use for Grafana (user is admin)
  • JENKINS_BACKUP_BUCKET: The Google Cloud Storage bucket where Jenkins backups will be placed or already exist, e.g. gs://derp/backups
  • GKE_CLUSTER_NAME: The desired name of the GKE cluster that will be created
  • GCLOUD_DATASET_BUCKET: The base bucket name where experiment datasets/results are stored
  • GIT_REPO_SSH: The SSH URL of the git repo, e.g. git@github.com:{username}/{project}.git
  • GIT_BRANCH: The git branch to pull from

If these environment variables aren't set, random values will be generated and stored as k8s secrets.
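
As a minimal sketch, the variables listed above could be exported as follows before running the setup. Every value here is a placeholder chosen for illustration (only the gs://derp/backups example comes from this README), and whether the scripts expect the passwords encoded without a trailing newline is an assumption:

```shell
# All values below are placeholders; substitute your own.
# printf '%s' avoids the trailing newline that echo would add before encoding
# (assumption: the setup scripts expect the password encoded without it).
export JENKINS_ADMIN_PW="$(printf '%s' 'example-jenkins-password' | base64)"
export GRAFANA_ADMIN_PW="$(printf '%s' 'example-grafana-password' | base64)"

export JENKINS_BACKUP_BUCKET="gs://derp/backups"
export GKE_CLUSTER_NAME="example-cluster"
export GCLOUD_DATASET_BUCKET="example-dataset-bucket"
export GIT_REPO_SSH="git@github.com:example-user/example-project.git"
export GIT_BRANCH="master"
```

Decoding a value with `base64 --decode` is a quick way to confirm that the encoded password round-trips as expected.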

In order to set up the environment, run the following command:

./setup.sh <GCLOUD_SVC_ACCOUNT_FILE> <KAGGLE_CREDS_FILE> <GIT_PRIVATE_CREDS_FILE> <GIT_KNOWN_HOSTS_FILE> <EXISTING_JENKINS_PVC>

where the arguments are:

  • <GCLOUD_SVC_ACCOUNT_FILE>: The path to the gcloud service account file, which has read/write permissions to Google Cloud Storage and Google container service
  • <KAGGLE_CREDS_FILE>: The path to the Kaggle credentials file with access to all datasets you'd like to download
  • <GIT_PRIVATE_CREDS_FILE>: The path to the git private credentials file with access to the GitHub account you'd like to use
  • <GIT_KNOWN_HOSTS_FILE>: The path to the corresponding git known hosts file
  • <EXISTING_JENKINS_PVC>: The name of the existing Jenkins persistent volume claim, if one exists. Leave blank if you'd like to create a new one
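
For illustration, an invocation might look like the sketch below. The file paths are placeholders, and `check_setup_files` is a hypothetical helper (not part of this repository) that only confirms the credential files are readable before setup.sh is called. The sketch assumes that "leave blank" means omitting the fifth argument, so a new Jenkins persistent volume claim would be created:

```shell
# Hypothetical helper (not part of this repo): fail early if any
# of the given files is missing or unreadable.
check_setup_files() {
  for _f in "$@"; do
    if [ ! -r "$_f" ]; then
      echo "missing or unreadable: $_f" >&2
      return 1
    fi
  done
}

# Placeholder paths; substitute your own.
GCLOUD_SVC_ACCOUNT_FILE="$HOME/creds/gcloud-service-account.json"
KAGGLE_CREDS_FILE="$HOME/creds/kaggle.json"
GIT_PRIVATE_CREDS_FILE="$HOME/creds/git_private_key"
GIT_KNOWN_HOSTS_FILE="$HOME/creds/known_hosts"

# Run setup only when all four files are readable.
# The fifth argument is omitted here (assumed to mean "create a new PVC").
if check_setup_files "$GCLOUD_SVC_ACCOUNT_FILE" "$KAGGLE_CREDS_FILE" \
    "$GIT_PRIVATE_CREDS_FILE" "$GIT_KNOWN_HOSTS_FILE"; then
  ./setup.sh "$GCLOUD_SVC_ACCOUNT_FILE" "$KAGGLE_CREDS_FILE" \
    "$GIT_PRIVATE_CREDS_FILE" "$GIT_KNOWN_HOSTS_FILE"
fi
```

To reuse an existing claim, its name would be appended as the fifth argument.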

To open the k8s dashboard, run:

./open_dashboard.sh

To get a token to open the k8s dashboard, run:

./get_dashboard_token.sh

and the token can then be pasted into the login page.

To set up a Jupyter deployment with the experiment data, run:

./setup_jupyter.sh <DATASET_ID> <NOTEBOOK_ID> <EXPERIMENT_ID>

The Jupyter notebook with the associated data will then be available by running:

kubectl port-forward svc/jupyter-notebook-<DATASET_ID> 8988

and opening localhost:8988 in a web browser.
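
Put together, the two Jupyter steps above could be sketched as follows. The three identifiers are placeholders, and the guard around the commands is an addition for illustration so the snippet does nothing when the script or kubectl is unavailable:

```shell
# Placeholder identifiers; substitute your own.
DATASET_ID="example-dataset"
NOTEBOOK_ID="example-notebook"
EXPERIMENT_ID="example-experiment"

# The service name follows the pattern used in the port-forward command above.
SERVICE="svc/jupyter-notebook-${DATASET_ID}"

# Only proceed when the setup script and kubectl are both available.
if [ -x ./setup_jupyter.sh ] && command -v kubectl >/dev/null 2>&1; then
  ./setup_jupyter.sh "$DATASET_ID" "$NOTEBOOK_ID" "$EXPERIMENT_ID"
  # Forwards local port 8988 to the service; runs until interrupted.
  kubectl port-forward "$SERVICE" 8988
fi
```

The port-forward may fail if the deployment is not yet ready; in that case, rerunning the kubectl command after a short wait is a reasonable first step.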

License

This project is licensed under the terms of the Apache 2.0 License.
