Deploy browser site for HCA dev as TF component of Azul (#6343) #6565

Merged
merged 7 commits into from
Sep 16, 2024
34 changes: 34 additions & 0 deletions OPERATOR.rst
Original file line number Diff line number Diff line change
@@ -569,6 +569,40 @@ backport PR first. The new PR will include the changes from the old one.

.. _#team-boardwalk Slack channel: https://ucsc-gi.slack.com/archives/C705Y6G9Z


Deploying the Data Browser
^^^^^^^^^^^^^^^^^^^^^^^^^^

The Data Browser is deployed in two steps. The first step is building the
``ucsc/data-browser`` project on GitLab. This is initiated by pushing a branch
whose name matches ``ucsc/*/*`` to one of our GitLab instances. The resulting
pipeline produces a tarball and stores it in the package registry on that
GitLab instance. The second step is running the ``deploy_browser`` job of the
``ucsc/azul`` project pipeline on that same instance. This job creates or
updates the necessary cloud infrastructure (CloudFront, S3, ACM, Route 53),
downloads the tarball from the package registry and unpacks it into the S3
bucket backing the Data Browser's CloudFront distribution.

Typically, CC requests the deployment of a Data Browser instance on Slack,
specifying the commit they wish to be deployed. After the system administrator
approves that request, the operator merges the specified commit into one of the
``ucsc/{atlas}/{deployment}`` branches and then pushes that branch to both the
``DataBiosphere/data-browser`` project on GitHub and the ``ucsc/data-browser``
project on the GitLab instance for the Azul ``{deployment}`` that backs the Data
Browser instance to be deployed. For the merge commit title, SmartGit's default
can be used, as long as the title reflects the commit (branch, tag, or sha1)
specified by CC.

The ``{atlas}`` placeholder can be ``hca``, ``anvil`` or ``lungmap``. Not all
combinations of ``{atlas}`` and ``{deployment}`` are valid. Examples of valid
branch names are ``ucsc/anvil/anvildev``, ``ucsc/anvil/anvilprod``,
``ucsc/hca/dev``, ``ucsc/hca/prod``, ``ucsc/lungmap/dev`` and
``ucsc/lungmap/prod``. The ``ucsc/data-browser`` pipeline on GitLab blindly
builds any branch, but Azul's ``deploy_browser`` job is configured to use the
tarball from exactly one branch (see ``deployments/*.browser/environment.py``),
and it always uses the tarball from the most recent pipeline on that branch.
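
The branch naming rule described above can be checked before pushing. The
following is only a sketch: ``parse_branch`` and ``ATLASES`` are hypothetical
names that do not exist in Azul, and the check covers only the atlas part of
the name, since the text does not give a complete list of valid deployments.

```python
import re

# The atlases named in the text above.
ATLASES = {'hca', 'anvil', 'lungmap'}

# Exactly three path segments, the first being the literal `ucsc`.
BRANCH_RE = re.compile(r'ucsc/(?P<atlas>[^/]+)/(?P<deployment>[^/]+)')


def parse_branch(branch: str) -> tuple[str, str]:
    """
    Split a branch name of the form ucsc/{atlas}/{deployment} into its atlas
    and deployment parts, rejecting names that don't follow the convention.
    """
    match = BRANCH_RE.fullmatch(branch)
    if match is None:
        raise ValueError(f'Not of the form ucsc/{{atlas}}/{{deployment}}: {branch!r}')
    atlas, deployment = match['atlas'], match['deployment']
    if atlas not in ATLASES:
        raise ValueError(f'Unknown atlas {atlas!r} in branch {branch!r}')
    # Whether this particular atlas/deployment pair is actually deployable
    # is decided by deployments/*.browser/environment.py, not by this check.
    return atlas, deployment
```

A name that passes this check may still be ignored by ``deploy_browser`` if no
deployment is configured to use that branch.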


Troubleshooting
---------------

17 changes: 8 additions & 9 deletions deployments/anvildev.browser/environment.py
@@ -28,15 +28,14 @@ def env() -> Mapping[str, Optional[str]]:
return {
'azul_terraform_component': 'browser',
'azul_browser_sites': json.dumps({
'ucsc/data-browser': {
'main': {
'anvil': {
'domain': '{AZUL_DOMAIN_NAME}',
'bucket': 'browser',
'tarball_path': 'out',
'real_path': ''
}
}
'browser': {
'zone': '{AZUL_DOMAIN_NAME}',
'domain': '{AZUL_DOMAIN_NAME}',
'project': 'ucsc/data-browser',
'branch': 'ucsc/anvil/anvildev',
'tarball_name': 'anvil',
'tarball_path': 'out',
'real_path': ''
}
})
}
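
The hunk above replaces the nested project/branch/site layout with one flat
entry per site. As a rough illustration of how the two shapes relate, the
sketch below converts the old layout to the new one. ``flatten_sites`` is a
hypothetical helper, not part of this PR, and it assumes that the hosted zone
equals the site's domain, which matches ``anvildev`` but not every deployment.

```python
def flatten_sites(old: dict) -> dict:
    """
    Convert {project: {branch: {site_name: site}}} into {bucket: site}, moving
    the former nesting levels into explicit entries of each site.
    """
    new = {}
    for project, branches in old.items():
        for branch, sites in branches.items():
            for tarball_name, site in sites.items():
                key = site['bucket']
                if key in new:
                    raise ValueError(f'Duplicate site key {key!r}')
                new[key] = {
                    # Assumption: the zone is the same as the domain. The old
                    # layout had no zone, so there is nothing to copy it from.
                    'zone': site['domain'],
                    'domain': site['domain'],
                    'project': project,
                    'branch': branch,
                    'tarball_name': tarball_name,
                    'tarball_path': site['tarball_path'],
                    'real_path': site['real_path'],
                }
    return new
```

The PR additionally changes the branch from ``main`` to a per-deployment
``ucsc/{atlas}/{deployment}`` branch, which a mechanical conversion like this
one does not do.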
17 changes: 8 additions & 9 deletions deployments/anvilprod.browser/environment.py
@@ -28,15 +28,14 @@ def env() -> Mapping[str, Optional[str]]:
return {
'azul_terraform_component': 'browser',
'azul_browser_sites': json.dumps({
'ucsc/data-browser': {
'main': {
'anvil': {
'domain': '{AZUL_DOMAIN_NAME}',
'bucket': 'browser',
'tarball_path': 'out',
'real_path': ''
}
}
'browser': {
'zone': '{AZUL_DOMAIN_NAME}',
'domain': '{AZUL_DOMAIN_NAME}',
'project': 'ucsc/data-browser',
'branch': 'ucsc/anvil/anvilprod',
'tarball_name': 'anvil',
'tarball_path': 'out',
'real_path': ''
}
})
}
41 changes: 41 additions & 0 deletions deployments/dev.browser/environment.py
@@ -0,0 +1,41 @@
from collections.abc import (
Mapping,
)
import json
from typing import (
Optional,
)


def env() -> Mapping[str, Optional[str]]:
"""
Returns a dictionary that maps environment variable names to values. The
values are either None or strings. String values can contain references to
other environment variables in the form `{FOO}` where FOO is the name of an
environment variable. See

https://docs.python.org/3.11/library/string.html#format-string-syntax

for the concrete syntax. These references will be resolved *after* the
overall environment has been compiled by merging all relevant
`environment.py` and `environment.local.py` files.

Entries with a `None` value will be excluded from the environment. They
can be used to document a variable without a default value in which case
other, more specific `environment.py` or `environment.local.py` files must
provide the value.
"""
return {
'azul_terraform_component': 'browser',
'azul_browser_sites': json.dumps({
'browser': {
'zone': '{AZUL_DOMAIN_NAME}',
'domain': 'explore.{AZUL_DOMAIN_NAME}',
'project': 'ucsc/data-browser',
'branch': 'ucsc/hca/dev',
'tarball_name': 'hca',
'tarball_path': 'out',
'real_path': ''
}
})
}
17 changes: 8 additions & 9 deletions deployments/tempdev.browser/environment.py
@@ -28,15 +28,14 @@ def env() -> Mapping[str, Optional[str]]:
return {
'azul_terraform_component': 'browser',
'azul_browser_sites': json.dumps({
'ucsc/data-browser': {
'main': {
'anvil': {
'domain': '{AZUL_DOMAIN_NAME}',
'bucket': 'browser',
'tarball_path': 'out',
'real_path': ''
}
}
'browser': {
'zone': '{AZUL_DOMAIN_NAME}',
'domain': '{AZUL_DOMAIN_NAME}',
'project': 'ucsc/data-browser',
'branch': 'ucsc/anvil/tempdev',
'tarball_name': 'anvil',
'tarball_path': 'out',
'real_path': ''
}
})
}
2 changes: 1 addition & 1 deletion deployments/tempdev/environment.py
@@ -145,7 +145,7 @@ def env() -> Mapping[str, Optional[str]]:

'GOOGLE_PROJECT': 'platform-temp-dev',

'AZUL_DEPLOYMENT_INCARNATION': '0',
'AZUL_DEPLOYMENT_INCARNATION': '1',

'AZUL_GOOGLE_OAUTH2_CLIENT_ID': '807674395527-erth0gf1m7qme5pe6bu384vpdfjh06dg.apps.googleusercontent.com',
}
30 changes: 30 additions & 0 deletions environment
@@ -320,6 +320,36 @@ _update_clone() {
.
}

# Destroy the most expensive resources in a main deployment and its components.
# Compared to complete destruction, hibernation has the advantage of needing
# less time to come back from, and not requiring incrementing the incarnation
# counter or adding service accounts to Terra groups. This function was written
# from memory. The `terraform` commands were tested individually but the
# function as a whole was not.
#
_hibernate() {
# shellcheck disable=SC2154
if test -z "$azul_terraform_component"; then
make -C lambdas && {
cd terraform &&
make validate &&
terraform destroy -target aws_elasticsearch_domain.index && {
cd gitlab &&
_select "$AZUL_DEPLOYMENT_STAGE.gitlab" &&
make validate &&
terraform destroy \
-target aws_ec2_client_vpn_endpoint.gitlab \
-target aws_instance.gitlab \
-target aws_nat_gateway.gitlab_0 \
-target aws_nat_gateway.gitlab_1
}
}
else
echo "Must have main component selected"
return 1
fi
}

# We disable `envhook.py` to avoid redundancy. The `envhook.py` script imports
# `export_environment.py`, too. We could also pass -S to `python3` but that
# causes problems on Travis (`importlib.util` failing to import `contextlib`).
70 changes: 38 additions & 32 deletions environment.py
@@ -806,38 +806,44 @@ def env() -> Mapping[str, Optional[str]]:
# managed by the `browser` TF component of the current Azul deployment.
#
# {
# 'ucsc/data-browser': { // The path of the GitLab project hosting
# // the source code for the site. The
# // project must exist on the GitLab
# // instance managing the current Azul
# // deployment.
#
# 'main': { // The name of the branch (in that project) from
# // which the site's content tarball was built
#
# 'anvil': { // The site name. Typically corresponds to an
# // Azul atlas as defined in the AZUL_CATALOGS
# // and a child directory of
# // .gitlab/sites/$AZUL_DEPLOYMENT_STAGE in the
# // source of the project referenced by the
# // top-level key in this structure.
#
# 'domain': '{AZUL_DOMAIN_NAME}', // The domain name of
# // the site
#
# 'bucket': 'browser', // The TF resource name (in the
# // `browser` component) of the
# // S3 bucket hosting the site
# // ('portal' or 'browser')
#
# 'tarball_path': 'explore', // The path to the site's
# // content in the tarball
#
# 'real_path': 'explore/anvil-cmg' // The path of that
# // same content in
# // the bucket
# }
# }
# 'browser': { // The TF resource name of per-site resources in the
# // `browser` component and unqualified name of the
# // S3 bucket hosting the site
#
# 'domain': '{AZUL_DOMAIN_NAME}', // The domain name of the
# // site
#
# 'zone': '{AZUL_DOMAIN_NAME}', // The name of the Route53
# // hosted zone containing the
# // A record for the domain name
# // of the site. The zone must
# // already exist before the
#
# 'project': 'ucsc/data-browser', // The path of the GitLab
# // project hosting the source
# // code for the site. The
# // project must exist on the
# // GitLab instance managing
# // the current Azul
# // deployment.
#
# 'branch': 'main', // The name of the branch (in that project)
# // from which the site's content tarball was
# // built
#
# 'tarball_name': 'anvil', // Typically corresponds to an Azul
# // atlas as defined in AZUL_CATALOGS
# // and a child directory of
# // .gitlab/sites/$AZUL_DEPLOYMENT_STAGE
# // in the source of the project
# // referenced by the 'project' entry
# // of this structure.
#
# 'tarball_path': 'explore', // The path to the site's content
# // in the tarball
#
# 'real_path': 'explore/anvil-cmg' // The path of that same
# // content in the bucket
# }
# }
#
16 changes: 1 addition & 15 deletions scripts/rename_resources.py
@@ -1,6 +1,5 @@
import argparse
import logging
import subprocess
import sys
from typing import (
Optional,
@@ -25,19 +24,6 @@
}


def terraform_state_list() -> list[str]:
try:
output = terraform.run('state', 'list')
except subprocess.CalledProcessError as e:
if e.returncode == 1 and 'No state file was found' in e.stderr:
log.info('No state file was found, assuming empty list of resources.')
return []
else:
raise
else:
return output.splitlines()


def main(argv: list[str]):
configure_script_logging(log)
parser = argparse.ArgumentParser(description=__doc__,
@@ -49,7 +35,7 @@ def main(argv: list[str]):
args = parser.parse_args(argv)

if renamed:
current_names = terraform_state_list()
current_names = terraform.run_state_list()
for current_name in current_names:
try:
new_name = renamed[current_name]
8 changes: 5 additions & 3 deletions src/azul/__init__.py
@@ -1111,14 +1111,16 @@ def shared_deployments_for_branch(self,
return None if branch is None else deployments.get(None)

class BrowserSite(TypedDict):
zone: str
domain: str
bucket: str
project: str
branch: str
tarball_name: str
tarball_path: str
real_path: str

@property
def browser_sites(self
) -> Mapping[str, Mapping[str, Mapping[str, BrowserSite]]]:
def browser_sites(self) -> Mapping[str, BrowserSite]:
import json
return json.loads(self.environ['azul_browser_sites'])

8 changes: 5 additions & 3 deletions src/azul/terraform.py
@@ -96,15 +96,17 @@ def run(self, *args: str, **kwargs) -> str:
**kwargs)
return cmd.stdout

def run_state_list(self):
def run_state_list(self) -> list[str]:
try:
stdout = self.run('state', 'list', stderr=subprocess.PIPE)
return stdout.splitlines()
except subprocess.CalledProcessError as e:
if 'No state file was found!' in e.stderr:
if e.returncode == 1 and 'No state file was found' in e.stderr:
log.info('No state file was found, assuming empty list of resources.')
return []
else:
raise
else:
return stdout.splitlines()
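
The new version of ``run_state_list`` uses ``try``/``except``/``else`` so that
only the command invocation is guarded, and it treats a missing state file as
an empty resource list. The following is a standalone sketch of that pattern
that can be exercised without Terraform. ``state_list`` and its ``run``
parameter are hypothetical stand-ins for the method and for
``self.run('state', 'list', stderr=subprocess.PIPE)``, and the sketch assumes
the command's stderr is captured as text.

```python
import logging
import subprocess
from typing import Callable

log = logging.getLogger(__name__)


def state_list(run: Callable[[], str]) -> list[str]:
    """
    Return the lines printed by `run`, or an empty list if `run` fails in the
    way `terraform state list` does when there is no state file.
    """
    try:
        stdout = run()
    except subprocess.CalledProcessError as e:
        # Only this specific failure is expected; anything else propagates.
        if e.returncode == 1 and 'No state file was found' in (e.stderr or ''):
            log.info('No state file was found, assuming empty list of resources.')
            return []
        else:
            raise
    else:
        # Runs only if no exception was raised, keeping the `try` body minimal.
        return stdout.splitlines()
```

Matching on the message without the trailing ``!`` makes the check slightly
more tolerant of punctuation changes in the message, while the additional
return code test keeps it from matching unrelated failures.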

schema_path = Path(config.project_root) / 'terraform' / '_schema.json.gz'
