Azure Infrastructure for the Clean Air project.
Provides 48h high-resolution air pollution forecasts over London via the UrbanAir-API.
It was previously repurposed to assess busyness in London during the COVID-19 pandemic, providing busyness data via the IP-whitelisted API.
- Installation
- Developer guide
- Docker guide
- Azure infrastructure
- Datasets
- Secret files
- SAS token
- Cheat sheet
- Contributors
See the old README file!
A list of key developers on the project. A good place to start if you wish to contribute.
Name | GitHub ID | Email | Admin |
---|---|---|---|
James Brandreth | @jamesbrandreth | jbrandreth@turing.ac.uk | |
Oscar Giles | @OscartGiles | ogiles@turing.ac.uk | Infrastructure, Prod Database, Kubernetes Cluster |
Oliver Hamelijnck | @defaultobject | ohamelijnck@turing.ac.uk | |
Chance Haycock | @chancehaycock | chaycock@turing.ac.uk | |
Christy Nakou | @ChristyNou | cnakou@turing.ac.uk | |
Patrick O'Hara | @PatrickOHara | pohara@turing.ac.uk | |
Harry Moss | @harryjmoss | h.moss@ucl.ac.uk | |
David Perez-Suarez | @dpshelio | d.perez-suarez@ucl.ac.uk | |
James Robinson | @jemrobinson | jrobinson@turing.ac.uk | Infrastructure, Prod Database, Kubernetes Cluster |
Tim Spain | @timspainUCL | t.spain@ucl.ac.uk | |
Edward Thorpe-Woods | @TeddyTW | ethorpe-woods@turing.ac.uk | |
- Azure account
- Non-infrastructure dependencies
- Infrastructure dependencies
- Login to Azure
- Configure a local database
- Insert static datasets into local database
- Configure schema and database roles
To contribute to the Turing deployment of this project you will need to be on the Turing Institute's Azure Active Directory. In other words, you will need a Turing email address of the form `<someone>@turing.ac.uk`. If you do not have one already, contact an infrastructure administrator.
If you are deploying the CleanAir infrastructure elsewhere you should have access to an Azure account (the cloud-computing platform where the infrastructure is deployed).
To contribute as a non-infrastructure developer you will need the following:
- Azure command line interface (CLI) (for managing your Azure subscriptions)
- Docker (for building and testing images locally)
- PostgreSQL (command-line tool for interacting with the database)
- Python (note that `python>=3.8` is currently incompatible with some of our dependencies; we currently recommend `python==3.7.8`)
- CleanAir Python packages (install Python packages)
- GDAL (for inserting static datasets)
- eccodes (for reading GRIB files)
The instructions below install the dependencies system-wide. If you would rather keep everything separate from your system, follow the instructions at the end to use an Anaconda environment instead.
Windows is not supported. However, you may use Windows Subsystem for Linux 2 and then install dependencies with conda.
If you have not already installed the command line interface for Azure, please follow the procedure here to get started.
Or follow a simpler option: install it in your own preferred environment with `pip install azure-cli`.
Download and install Docker Desktop.
PostgreSQL and PostGIS.
Setting up a local Postgres instance with PostGIS can be troublesome, so we recommend using a docker image.
docker run --name database -e POSTGRES_HOST_AUTH_METHOD=trust -p 5432:5432 cleanairdocker.azurecr.io/database
If you aren't logged in with access to the cleanairdocker registry, you can build the image yourself and run it with:
docker build -t database:latest -f ./containers/dockerfiles/test_database.dockerfile .
docker run --name database -e POSTGRES_HOST_AUTH_METHOD=trust -p 5432:5432 database
Warning: if you have another Postgres install running, it will likely be using port 5432. In this case, map the container to a different host port, for example 5000 (remember to change your local secrets file to match). Run instead with:
docker run --name database -e POSTGRES_HOST_AUTH_METHOD=trust -p 5000:5432 database
Alternatively, you can install Postgres with your package manager, such as Homebrew:
brew install postgresql postgis
GDAL can be installed using brew on OSX:
brew install gdal
Alternatively, use any of the binaries provided for different platforms.
eccodes can likewise be installed using brew:
brew install eccodes
The following are optional as we can run everything on docker images. However, they are recommended for development/testing and required for setting up a local copy of the database.
pip install -r containers/requirements.txt
To run the CleanAir functionality locally (without a docker image) you can install the package with `pip`.
For a basic install which will allow you to set up a local database run:
pip install -e 'containers/cleanair[<optional-dependencies>]'
Certain functionality requires optional dependencies. These can be installed by adding the following:
Option keyword | Functionality |
---|---|
models | CleanAir GPFlow models |
traffic | FBProphet Traffic Models |
For getting started we recommend:
pip install -e 'containers/cleanair[models, traffic]'
All additional functionality related to the London Busyness project requires:
pip install -e 'containers/odysseus'
pip install -e 'containers/urbanair'
Cloud infrastructure developers will require the following in addition to the non-infrastructure dependencies.
- Access to the deployment Azure subscription
- Terraform (for configuring the Azure infrastructure)
- Travis Continuous Integration (CI) CLI (for setting up automatic deployments)
You need to have access to the CleanAir Azure subscription to deploy infrastructure. If you need access, contact an infrastructure administrator.
The Azure infrastructure is managed with Terraform. To get started, download Terraform from their website. If using Mac OS, you can instead use homebrew:
brew install terraform
Ensure you have Ruby 1.9.3 or above installed:
brew install ruby
gem update --system
Then install the Travis CI CLI with:
gem install travis --no-document
On some versions of OSX, this fails, so you may need the following alternative:
ARCHFLAGS=-Wno-error=unused-command-line-argument-hard-error-in-future gem install --user-install travis -v 1.8.13 --no-document
Verify with
travis version
If this fails ensure Gems user_dir is on the path:
cat << EOF >> ~/.bash_profile
export PATH="\$PATH:$(ruby -e 'puts Gem.user_dir')/bin"
EOF
It is possible to set everything up within a conda environment, which lets you keep different versions of software on your machine. All the steps above can be done with:
# Non-infrastructure dependencies
conda create -n busyness python=3.7.8 --channel conda-forge
conda activate busyness
conda install -c anaconda postgresql
conda install -c conda-forge gdal postgis uwsgi
pip install azure-cli
pip install azure-nspkg azure-mgmt-nspkg
# The following fails with: ERROR: azure-cli 2.6.0 has requirement azure-storage-blob<2.0.0,>=1.3.1, but you'll have azure-storage-blob 12.3.0 which is incompatible.
# but they install fine.
pip install -r containers/requirements.txt
pip install -e 'containers/cleanair[models,traffic]'
pip install -e 'containers/odysseus'
pip install -e 'containers/urbanair'
# Infrastructure dependencies
# If you don't get rb-ffi and rb-json you'll need to install gcc_linux-64 and libgcc to build them in order to install travis.
conda install -c conda-forge terraform ruby rb-ffi rb-json
# At least on Linux you'll need to disable IPv6 for this version of gem to work.
gem install travis --no-document
# Create a soft link of the executables installed by gem into a place seen within the conda env.
conda_env=$(conda info --json | grep -w "active_prefix" | awk '{print $2}'| sed -e 's/,//' -e 's/"//g')
ln -s $(find $conda_env -iname 'travis' | grep bin) $conda_env/bin/
To start working with Azure, you must first log in to your account from the terminal:
az login
Infrastructure developers should additionally check which Azure subscriptions they have access to by running:
az account list --output table --refresh
Then set your default subscription to the Clean Air project (if you cannot see it in the output generated from the last line you do not have access):
az account set --subscription "CleanAir"
If you don't have access this is ok. You only need it to deploy and manage infrastructure.
In production we use a managed PostgreSQL database. However, it is useful to have a local copy to run tests and for development. To set up a local version start a local postgres server:
brew services start postgresql
If you installed the database using conda, set up the server and users first with:
initdb -D mylocal_db
pg_ctl -D mylocal_db -l logfile start
createdb --owner=${USER} myinner_db
When you want to work in this environment again you'll need to run:
pg_ctl -D mylocal_db -l logfile start
You can stop it with:
pg_ctl -D mylocal_db stop
We store database credentials in json files. For production databases you should never store database passwords in these files - for more information see the production database section.
mkdir -p .secrets
echo '{
"username": "postgres",
"password": "''",
"host": "localhost",
"port": 5432,
"db_name": "cleanair_test_db",
"ssl_mode": "prefer"
}' > .secrets/.db_secrets_offline.json
N.B. In some cases your default username may be your OS user. Change the username in the file above if this is the case. Then create the test database:
createdb cleanair_test_db
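The offline secrets file created above can equally be written from Python, which makes it easier to adjust the port or username mentioned in the notes above. The following is a minimal sketch only; the helper name `write_offline_secrets` is hypothetical and is not part of the CleanAir package.

```python
import json
from pathlib import Path


def write_offline_secrets(path, username="postgres", port=5432):
    """Write a local database secrets file and return its contents as a dict.

    Hypothetical helper: the keys mirror the JSON shown above, nothing more.
    """
    secrets = {
        "username": username,
        "password": "",
        "host": "localhost",
        "port": port,
        "db_name": "cleanair_test_db",
        "ssl_mode": "prefer",
    }
    path = Path(path)
    # Create the .secrets directory if it does not exist yet.
    path.parent.mkdir(parents=True, exist_ok=True)
    # Overwrite rather than append, so the file always holds one JSON object.
    path.write_text(json.dumps(secrets, indent=4) + "\n")
    return secrets
```

For example, `write_offline_secrets(".secrets/.db_secrets_offline.json", port=5000)` would match a database container published on host port 5000.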
We must now set up the database schema. This also creates a number of roles on the database.
Create an environment variable with the location of your secrets file, then run the role configuration script:
export DB_SECRET_FILE=$(pwd)/.secrets/.db_secrets_offline.json
python containers/entrypoints/setup/configure_db_roles.py -s $DB_SECRET_FILE -c configuration/database_role_config/local_database_config.yaml
The database requires a number of static datasets. We can now insert static data into our local database. You will need a SAS token to access static data files stored on Azure.
If you have access to Azure you can log in to Azure from the command line and run the following to obtain a SAS token:
SAS_TOKEN=$(python containers/entrypoints/setup/insert_static_datasets.py generate)
By default the SAS token will last for 1 hour. If you need a longer expiry time pass the `--days` and `--hours` arguments to the program above. N.B. It's better to use short expiry times where possible.
Otherwise you must request a SAS token from an infrastructure developer and set it as a variable:
SAS_TOKEN=<SAS_TOKEN>
You can then download and insert all static data into the database by running the following:
python containers/entrypoints/setup/insert_static_datasets.py insert -t $SAS_TOKEN -s $DB_SECRET_FILE -d rectgrid_100 street_canyon hexgrid london_boundary oshighway_roadlink scoot_detector urban_village
If you would also like to add UKMAP to the database run:
python containers/entrypoints/setup/insert_static_datasets.py insert -t $SAS_TOKEN -s $DB_SECRET_FILE -d ukmap
UKMAP is extremely large and will take ~1h to download and insert. We therefore do not run tests against UKMAP at the moment.
N.B. SAS tokens will expire after a short length of time, after which you will need to request a new one.
You can check everything is configured correctly by running:
pytest containers/tests/test_database_init --secretfile $DB_SECRET_FILE
To access the production database you will need an Azure account and be given access by one of the database administrators. You should discuss what your access requirements are (e.g. do you need write access). To access the database, first log in to Azure from the terminal.
If you do not have an Azure subscription you must use:
az login --allow-no-subscriptions
You can then request an access token. The token will be valid for between 5 minutes and 1 hour. Set the token as an environment variable:
export PGPASSWORD=$(az account get-access-token --resource-type oss-rdbms --query accessToken -o tsv)
Once your IP has been whitelisted (ask the database administrators), you will be able to access the database using psql:
psql "host=cleanair-inputs-2021-server.postgres.database.azure.com port=5432 dbname=cleanair_inputs_db user=<your-turing-credentials>@cleanair-inputs-2021-server sslmode=require"
replacing `<your-turing-credentials>` with your Turing credentials (e.g. `jblogs@turing.ac.uk`).
To connect to the database using the CleanAir package you will need to create another secret file:
echo '{
"username": "<your-turing-credentials>@cleanair-inputs-2021-server",
"host": "cleanair-inputs-2021-server.postgres.database.azure.com",
"port": 5432,
"db_name": "cleanair_inputs_db",
"ssl_mode": "require"
}' > .secrets/db_secrets_ad.json
Make sure you then replace `<your-turing-credentials>` with your Turing credentials, so that the full username reads, for example, `jblogs@turing.ac.uk@cleanair-inputs-2021-server`.
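The username format is easy to get wrong, so here is a small sketch of how the full username and the psql connection string fit together. The function names are hypothetical; the server name, database name and port are copied from the commands above.

```python
SERVER = "cleanair-inputs-2021-server"


def production_username(turing_email):
    """Append the server name to a Turing email address, as the database expects."""
    return f"{turing_email}@{SERVER}"


def psql_connection_string(turing_email):
    """Build the connection string passed to psql in the example above."""
    return (
        f"host={SERVER}.postgres.database.azure.com port=5432 "
        f"dbname=cleanair_inputs_db user={production_username(turing_email)} "
        "sslmode=require"
    )
```

Calling `production_username("jblogs@turing.ac.uk")` gives the value to place in the `username` field of the secret file.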
The directory containers/entrypoints contains Python scripts which are then built into Docker images in containers/dockerfiles. You can run them locally.
These are scripts which collect and insert data into the database. To see what arguments they take you can call any of the files with the argument `-h`, for example:
python containers/entrypoints/inputs/input_laqn_readings.py -h
The entrypoints will need to connect to a database. To do so you can pass one or more of the following arguments:
- `--secretfile`: Full path to one of the secret .json files you created in the `.secrets` directory.
- `--secret-dict`: A set of parameters to override the values in `--secretfile`. For example you could alter the port and ssl parameters as `--secret-dict port=5411 ssl_mode=prefer`
You will notice that the `db_secrets_ad.json` file we created does not contain a password. To run an entrypoint against a production database you must run:
az login
export PGPASSWORD=$(az account get-access-token --resource-type oss-rdbms --query accessToken -o tsv)
When you run an entrypoint script the CleanAir package will read the `PGPASSWORD` environment variable. This will also take precedence over any value provided in the `--secret-dict` argument.
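The order of precedence described above (secret file, then `--secret-dict` overrides, then `PGPASSWORD`) can be sketched as follows. This is an illustration of the described behaviour, not the CleanAir implementation: the function names are hypothetical, and leaving override values as strings is an assumption.

```python
import json
import os


def parse_overrides(pairs):
    """Split key=value strings, e.g. ["port=5411"] becomes {"port": "5411"}."""
    overrides = {}
    for pair in pairs:
        key, _, value = pair.partition("=")
        overrides[key] = value
    return overrides


def resolve_connection_settings(secretfile, secret_dict=(), environ=None):
    """Merge the three sources of connection settings in order of precedence."""
    environ = os.environ if environ is None else environ
    # 1. Start from the values stored in the secret file.
    with open(secretfile) as handle:
        settings = json.load(handle)
    # 2. Values given via --secret-dict override the file.
    settings.update(parse_overrides(secret_dict))
    # 3. PGPASSWORD, when set, wins over everything else for the password.
    if "PGPASSWORD" in environ:
        settings["password"] = environ["PGPASSWORD"]
    return settings
```

Passing `environ` explicitly keeps the sketch easy to test without touching the real environment.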
To run an entry point from a docker file we first need to build a docker image. Here shown for the satellite input entry point:
docker build -t input_satellite:local -f containers/dockerfiles/input_satellite_readings.Dockerfile containers
To run it we need to set a few more environment variables. The first is the directory containing the secret files:
SECRET_DIR=$(pwd)/.secrets
Now get a new token:
export PGPASSWORD=$(az account get-access-token --resource-type oss-rdbms --query accessToken -o tsv)
Finally you can run the docker image, passing PGPASSWORD as an environment variable (:warning: this writes data into the online database)
docker run -e PGPASSWORD -v $SECRET_DIR:/secrets input_satellite:local -s 'db_secrets_ad.json' -k <copernicus-key>
Here we also provided the Copernicus API key, which is stored in the `cleanair-secrets` Azure key vault.
If you want to run that example with the local database you can do so by:
COPERNICUS_KEY=$(az keyvault secret show --vault-name cleanair-secrets --name satellite-copernicus-key -o tsv --query value)
# OSX or Windows: change "localhost" to host.docker.internal on your db_secrets_offline.json
docker run -e PGPASSWORD -v $SECRET_DIR:/secrets input_satellite:local -s 'db_secrets_offline.json' -k $COPERNICUS_KEY
# Linux:
docker run --network host -e PGPASSWORD -v $SECRET_DIR:/secrets input_satellite:local -s 'db_secrets_offline.json' -k $COPERNICUS_KEY
The UrbanAir RESTful API is a FastAPI application. To run it locally, follow these steps:
Ensure you have configured a secrets file for the CleanAir database, then run:
export PGPASSWORD=$(az account get-access-token --resource-type oss-rdbms --query accessToken -o tsv)
DB_SECRET_FILE=$(pwd)/.secrets/.db_secrets_ad.json uvicorn urbanair.urbanair:app --reload
To build the API docker image
docker build -t fastapi:test -f containers/dockerfiles/urbanairapi.Dockerfile 'containers'
Then run the docker image:
export DB_SECRET_FILE='.db_secrets_ad.json'
SECRET_DIR=$(pwd)/.secrets
docker run -i -p 80:80 -e DB_SECRET_FILE -e PGPASSWORD -e APP_MODULE="urbanair.urbanair:app" -v $SECRET_DIR:/secrets fastapi:test
Before being accepted into master all code should have well-written documentation.
Please use Google Style Python Docstrings.
We would like to move towards adding type hints, so you may optionally add types to your code, in which case you do not need to include types in your Google style docstrings.
Adding and updating existing documentation is highly encouraged.
We like gitmoji for an emoji guide to our commit messages. You might consider (entirely optional) using the gitmoji-cli as a hook when writing commit messages.
The general workflow for contributing to the project is to first choose an issue (or create one) to work on and assign yourself to it.
You can find issues that need work by searching for the `Needs assignment` label. If you decide to move onto something else or wonder what you've got yourself into, please unassign yourself, leave a comment about why you dropped the issue (e.g. got bored, blocked by something etc.) and re-add the `Needs assignment` label.
You are encouraged to open a pull request earlier rather than later (either as a draft pull request or by adding `WIP` to the title) so others know what you are working on.
How you label branches is optional, but we encourage using `iss_<issue-number>_<description_of_issue>`, where `<issue-number>` is the GitHub issue number and `<description_of_issue>` is a very short description of the issue. For example, `iss_928_add_api_docs`.
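As a small illustration of the suggested branch naming scheme, the sketch below builds a branch name from an issue number and a short description. The helper `branch_name` is hypothetical and not part of the repository tooling.

```python
import re


def branch_name(issue_number, description):
    """Build a branch name of the form iss_<issue-number>_<description_of_issue>."""
    # Lower-case the description and collapse anything that is not a letter
    # or digit into a single underscore.
    slug = re.sub(r"[^a-z0-9]+", "_", description.lower()).strip("_")
    return f"iss_{issue_number}_{slug}"


print(branch_name(928, "Add API docs"))  # prints iss_928_add_api_docs
```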
Tests should be written where possible before code is accepted into master. Contributing tests to existing code is highly desirable. Tests will also be run on travis (see the travis configuration).
All tests can be found in the `containers/tests/` directory. We already ran some tests to check our local database was set up.
To run the full test suite against the local database run:
export DB_SECRET_FILE=$(pwd)/.secrets/.db_secrets_offline.json
pytest containers --secretfile $DB_SECRET_FILE
The following shows an example test:
def test_scoot_reading_empty(secretfile, connection):
    conn = DBWriter(
        secretfile=secretfile, initialise_tables=True, connection=connection
    )
    with conn.dbcnxn.open_session() as session:
        assert session.query(ScootReading).count() == 0
It uses the `DBWriter` class to connect to the database. In general when interacting with a database we write a class which inherits from either `DBWriter` or `DBReader`. Both classes take a `secretfile` as an argument which provides database connection secrets.
Critically, we also pass a special `connection` fixture when initialising any class that interacts with the database.
This fixture ensures that all interactions with the database take place within a transaction. At the end of the test the transaction is rolled back, leaving the database in the same state it was in before the test was run, even if `commit` is called on the database.
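The rollback idea can be illustrated with the standard library alone. The sketch below uses an in-memory SQLite database as a stand-in for PostgreSQL, with a hypothetical table name and made-up row; it is not the project's `connection` fixture and does not reproduce the protection against explicit `commit` calls that the fixture provides.

```python
import sqlite3


def run_in_rolled_back_transaction(conn, test_body):
    """Run test_body(conn) inside a transaction that is always rolled back."""
    conn.execute("BEGIN")
    try:
        test_body(conn)
    finally:
        # Undo everything the test did, whether it passed or failed.
        conn.execute("ROLLBACK")


def insert_one_row(conn):
    """Stand-in for a test that writes to the database."""
    conn.execute("INSERT INTO scoot_reading VALUES ('example_detector')")
    count = conn.execute("SELECT COUNT(*) FROM scoot_reading").fetchone()[0]
    assert count == 1  # the row is visible inside the transaction


# isolation_level=None puts sqlite3 in autocommit mode, so the only
# transaction is the one opened explicitly above.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE scoot_reading (detector_id TEXT)")

run_in_rolled_back_transaction(conn, insert_one_row)
row_count = conn.execute("SELECT COUNT(*) FROM scoot_reading").fetchone()[0]
```

After the call the table is empty again, which is the property the fixture gives each test.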
The following steps provide useful tools for researchers to use, for example setting up jupyter notebooks and running models using a GPU.
First install jupyter with pip (you can also use conda):
pip install jupyter
You can start the notebook:
jupyter notebook
Alternatively you may wish to use jupyter lab which offers more features on top of the normal notebooks.
jupyter lab
This will require some additional steps for adding jupyter lab extensions for plotly.
For some notebooks you may also want to use Mapbox for visualising spatial data. To do this you will need a Mapbox access token, which you can store inside your `.env` file (see below).
To access the database, the notebooks need access to the `PGPASSWORD` environment variable.
It is also recommended to set the `DB_SECRET_FILE` variable.
We will create a `.env` file within your notebook directory `path/to/notebook` where you will be storing environment variables.
Note: if you are using a shared system or scientific cluster, do not follow these steps and do not store your password in a file.
Run the below command to create a `.env` file, replacing `path/to/secretfile` with the path to your `db_secrets`.
echo '
DB_SECRET_FILE="path/to/secretfile"
PGPASSWORD=
' > path/to/notebook/.env
To set the `PGPASSWORD`, run the following command.
This will create a new password using the Azure CLI and replace the line in `.env` that contains `PGPASSWORD` with the new password.
Remember to replace `path/to/notebook` with the path to your notebook directory.
The `-i ''` form is the BSD/macOS syntax; with GNU sed on Linux, drop the empty `''` after `-i`.
sed -i '' "s/.*PGPASSWORD.*/PGPASSWORD=$(az account get-access-token --resource-type oss-rdbms --query accessToken -o tsv)/g" path/to/notebook/.env
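If you prefer not to depend on a particular flavour of `sed`, the same in-place replacement can be sketched in Python. The helper `set_env_value` is hypothetical; like the `sed` command it replaces every line mentioning the key, and, as an added assumption, it appends the key when no such line exists.

```python
from pathlib import Path


def set_env_value(env_path, key, value):
    """Replace each line mentioning `key` with `key=value`, appending if absent."""
    path = Path(env_path)
    lines = path.read_text().splitlines() if path.exists() else []
    updated = []
    replaced = False
    for line in lines:
        if key in line:
            # Mirrors the sed pattern .*PGPASSWORD.*, which matches the whole line.
            updated.append(f"{key}={value}")
            replaced = True
        else:
            updated.append(line)
    if not replaced:
        updated.append(f"{key}={value}")
    path.write_text("\n".join(updated) + "\n")
```

The value would be the access token printed by the `az account get-access-token` command shown above, for example `set_env_value("path/to/notebook/.env", "PGPASSWORD", token)`.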
If you need to store other environment variables and access them in your notebook, simply add them to the `.env` file.
To access the environment variables, include the following lines at the top of your jupyter notebook:
%load_ext dotenv
%dotenv
You can now access the value of these variables as follows:
import os

secretfile = os.getenv("DB_SECRET_FILE", None)
Remember that the `PGPASSWORD` token will only be valid for ~1h.
To train a model on your local machine you can run a model fitting entrypoint:
TL;DR
urbanair init production
urbanair model data generate-config --train-source laqn --train-source satellite --pred-source laqn
urbanair model data generate-full-config
urbanair model data download --training-data --prediction-data
urbanair model setup mrdgp
urbanair model fit mrdgp
urbanair model update result mrdgp
urbanair model update metrics INSTANCE_ID
urbanair model data generate-config --train-source laqn --train-source satellite --pred-source satellite --pred-source laqn --pred-source hexgrid
urbanair model data generate-full-config
urbanair model data download --training-data --prediction-data
urbanair model data save-cache <data-dir-name>
urbanair model svgp fit <data-directory>
or for deep gp
urbanair model deep-gp fit <data-directory>
Build a model fitting docker image with tensorflow installed:
docker build --build-arg git_hash=$(git show -s --format=%H) -t cleanairdocker.azurecr.io/model_fitting -f containers/dockerfiles/model_fitting.Dockerfile containers
Alternatively you can pull the docker image if you haven't made any changes:
docker pull cleanairdocker.azurecr.io/model_fitting
To fit and predict using the SVGP you can run:
docker run -it --rm cleanairdocker.azurecr.io/model_fitting:latest sh /app/scripts/svgp_static.sh
To fit and predict using the MRDGP run:
docker run -it --rm cleanairdocker.azurecr.io/model_fitting:latest sh /app/scripts/mrdgp_static.sh
If you are running on your local machine you will also need to add `-e PGPASSWORD -e DB_SECRET_FILE -v $SECRET_DIR:/secrets` after the `run` command and set the environment variables (see above in the README).
Many scientific clusters will give you access to Singularity. This software means you can import and run Docker images without having Docker installed or being a superuser. Scientific clusters are often a pain to set up, so we strongly recommend using Singularity & Docker to avoid a painful experience.
First login to your HPC and ensure singularity is installed:
singularity --version
Now we will need to pull the Docker image from our Docker container registry on Azure. Since our Docker images are private you will need to log in to the container registry.
- Go to portal.azure.com.
- Search for the `CleanAirDocker` container registry.
- Go to `Access keys`.
- The username is `CleanAirDocker`. Copy the password.
singularity pull --docker-login docker://cleanairdocker.azurecr.io/mf:latest
Then build the singularity image to a `.sif` file.
We recommend you store all of your singularity images in a directory called `containers`.
singularity build --docker-login containers/model_fitting.sif docker://cleanairdocker.azurecr.io/mf:latest
To test everything has built correctly, spawn a shell and run python:
singularity shell containers/model_fitting.sif
python
Then try importing tensorflow and cleanair:
import tensorflow as tf
tf.__version__
import cleanair
cleanair.__version__
Finally you can run the singularity image, passing any arguments you see fit:
singularity run containers/model_fitting.sif --secretfile $SECRETS
💀 The following steps are needed to set up the Clean Air cloud infrastructure. Only infrastructure administrators should deploy.
Login to Travis with your github credentials, making sure you are in the Clean Air repository (Travis automatically detects your repository):
travis login --pro
Create an Azure service principal using the documentation for the Azure CLI or with Powershell, ensuring that you keep track of the `NAME`, `ID` and `PASSWORD/SECRET` for the service principal, as these will be needed later.
Terraform uses a backend to keep track of the infrastructure state.
We keep the backend in Azure storage so that everyone has a synchronised version of the state.
You can download the `tfstate` file with `az` though you won't need it.
cd terraform
az storage blob download -c terraformbackend -f terraform.tfstate -n terraform.tfstate --account-name terraformstorage924roouq --auth-mode key
To enable this, we have to create an initial Terraform configuration by running (from the root directory):
python cleanair_setup/initialise_terraform.py -i $AWS_KEY_ID -k $AWS_KEY -n $SERVICE_PRINCIPAL_NAME -s $SERVICE_PRINCIPAL_ID -p $SERVICE_PRINCIPAL_PASSWORD
Where `AWS_KEY_ID` and `AWS_KEY` are the secure key information needed to access TfL's SCOOT data on Amazon Web Services.
AWS_KEY=$(az keyvault secret show --vault-name terraform-configuration --name scoot-aws-key -o tsv --query value)
AWS_KEY_ID=$(az keyvault secret show --vault-name terraform-configuration --name scoot-aws-key-id -o tsv --query value)
The `SERVICE_PRINCIPAL_NAME`, `_ID` and `_PASSWORD` are also available in the `terraform-configuration` keyvault.
SERVICE_PRINCIPAL_NAME=$(az keyvault secret show --vault-name terraform-configuration --name azure-service-principal-name -o tsv --query value)
SERVICE_PRINCIPAL_ID=$(az keyvault secret show --vault-name terraform-configuration --name azure-service-principal-id -o tsv --query value)
SERVICE_PRINCIPAL_PASSWORD=$(az keyvault secret show --vault-name terraform-configuration --name azure-service-principal-password -o tsv --query value)
This will only need to be run once (by anyone), but it's not a problem if you run it multiple times.
To build the Terraform infrastructure go to the `terraform` directory:
cd terraform
and run:
terraform init
If you want to, you can look at the `backend_config.tf` file, which should contain various details of your Azure subscription.
NB. It is important that this file is in `.gitignore`. Do not push this file to the remote repository.
Then run:
terraform plan
which creates an execution plan. Check this matches your expectations. If you are happy then run:
terraform apply
to set up the Clean Air infrastructure on Azure using Terraform. You should be able to see this on the Azure portal.
Terraform created a DNS Zone in the Kubernetes cluster resource group (`RG_CLEANAIR_KUBERNETES_CLUSTER`). Navigate to the DNS Zone on the Azure portal and copy the four nameservers in the “NS” record. Send the nameservers to Turing IT Services and ask them to add the subdomain’s DNS record as an NS record for `urbanair` in the `turing.ac.uk` DNS zone record.
- When viewing the DNS zone on the Azure Portal, click `+ Record set`.
- In the Name field, enter `urbanair`.
- Set Alias record set to “Yes” and this will bring up some new options.
- We can now set up Azure pipelines. Once the cleanair api has been deployed on kubernetes you can update the alias record to point to the ip address of the cleanair-api on the cluster.
Terraform will now have created a number of databases. We need to add the datasets to the database.
This is done using Docker images from the Azure container registry.
You will need the username, password and server name for the Azure container registry.
All of these will be stored as secrets in the `RG_CLEANAIR_INFRASTRUCTURE > cleanair-secrets` Azure KeyVault.
These Docker images are built by an Azure pipeline whenever commits are made to the master branch of the GitHub repository.
Ensure that you have configured Azure pipelines to use this GitHub repository.
You will need to add Service Connections to GitHub and to Azure (the Azure one should be called `cleanair-scn`).
Currently a pipeline is set up here.
To run the next steps we need to ensure that this pipeline runs a build in order to add the Docker images to the Azure container registry created by Terraform.
Either push to the GitHub repository, or rerun the last build by going to the Azure pipeline page and clicking `Run pipeline` on the right-hand context menu.
This will build all of the Docker images and add them to the registry.
Now go to Azure and update the A-record to point to the ip address of the cleanair-api on the cluster.
To add static datasets, follow the Static data insert instructions but use the production database credentials.
The live datasets (like LAQN or AQE) are populated using regular jobs that create an Azure container instance and add the most recent data to the database. These are run automatically through Kubernetes and the Azure pipeline above is used to keep track of which version of the code to use.
The Azure pipeline will deploy the cleanair helm chart to the Azure Kubernetes cluster we deployed with Terraform. If you deployed GPU-enabled machines on Azure (the current default in the Terraform script) then you need to install the NVIDIA device plugin daemonset. The manifest for this is adapted from the Azure docs. However, as our GPU machines have taints applied, we have to add tolerations to the manifest, otherwise the nodes will block the daemonset. To install the custom manifest run:
kubectl apply -f kubernetes/gpu_resources/nvidia-device-plugin-ds.yaml
To destroy all the resources created by Terraform run:
terraform destroy
You can check everything was removed on the Azure portal. Then log in to TravisCI and delete the Azure Container repo environment variables.