Skip to content

Commit

Permalink
[ci] Create a test MySQL server in test and dev namespaces (#13030)
Browse files Browse the repository at this point in the history
This change creates mysql pods for test and dev namespaces, instead of
sharing the CloudSQL database server. The areas of change are as
follows:

### Generation of the namespace's database-server-config
The current approach in main does a little trick. Since the current
`createDatabase` step uses the `database-server-config` from default to
generate admin/user sql configs, the CI pipeline creates a dummy
database `test-database-instance` to create a
`sql-test-instance-admin-config` that inherits the credentials from the
production `database-server-config`, and then copies that within the
test namespace to `database-server-config`.

In this change, since we are creating the server ourselves, we can just
replace these with a step that creates a `database-server-config` from
scratch, and then uses that for the DB pod.

Overall making these changes really gave me the heebie jeebies that the
test and dev namespaces have all these credentials to the CloudSQL
server. I'm glad this gets rid of that.

### Accessing the database server
We use the DB pod's service DNS name as the `host` so inside Kubernetes
this Just Works. The one caveat is the CI pipeline in which we run
migrations in batch jobs. Those jobs need a way to reach the DB pod. I
achieve this with a NodePort and then use the job's K8s credentials to
resolve the node and port that the DB is on. The code I've added to do
this resolution feels a bit janky, wouldn't mind some feedback on that.
In terms of security, if a user job was able to somehow resolve the
address of a test db, they would still not have the credentials to
access it, and this is currently also the case with the production
database. Nevertheless, this does raise an action item that we should
only allow traffic to the k8s and DB subnets for `network=private` jobs,
but I think we should make that a separate PR.

### Database creation
In order to test this properly in a dev deploy, I needed to make some
changes to `create_database.py`. In main, dev deploys can't create
databases. I think they should be able to, and those operations should
just be idempotent. When allowing dev deploys to create databases, I hit
the `ALTER USER` lines in `create_database.py` which lock out any
already-deployed application, which feels silly. Instead, I create the
mysql user and create the corresponding secret iff the mysql username
does not already exist.

### create_initial_account.py
When starting off with a fresh database, we encounter a bootstrapping
problem where we need to add the developer's user. The current way to
add the developer's user is through `create_initial_account.py` which
assumes that the developer's gsa key secret is not already in the
namespace, but now it could be, so I delete the key in the namespace if
it exists before creating it. This could all change with my other PR to
add devs to test namespaces. But this change to allow ephemeral dev
databases will make testing that other PR *way* easier.
  • Loading branch information
daniel-goldstein authored May 12, 2023
1 parent 8362afa commit ae7cdee
Show file tree
Hide file tree
Showing 11 changed files with 316 additions and 119 deletions.
76 changes: 47 additions & 29 deletions build.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -255,35 +255,43 @@ steps:
dependsOn:
- base_image
- merge_code
- kind: createDatabase
name: test_database_instance
databaseName: test-instance
image:
valueFrom: ci_utils_image.image
migrations: []
namespace:
valueFrom: default_ns.name
scopes:
- test
dependsOn:
- default_ns
- ci_utils_image
- kind: runImage
name: create_database_server_config
name: create_test_database_server_config
image:
valueFrom: ci_utils_image.image
valueFrom: create_certs_image.image
script: |
kubectl -n {{ default_ns.name }} get -o json secret {{ test_database_instance.admin_secret_name }} | jq '{apiVersion, kind, type, data, metadata: {name: "database-server-config"}}' | kubectl -n {{ default_ns.name }} apply -f -
set -ex
if ! kubectl get secret -n {{ default_ns.name }} database-server-config;
then
NAMESPACE={{ default_ns.name }} bash /create_test_db_config.sh
fi
serviceAccount:
name: admin
namespace:
valueFrom: default_ns.name
scopes:
- dev
- test
dependsOn:
- default_ns
- ci_utils_image
- test_database_instance
- create_certs_image
- kind: deploy
name: deploy_test_db
namespace:
valueFrom: default_ns.name
config: docker/mysql/db.yaml
wait:
- kind: Service
name: db
for: alive
resource_type: statefulset
scopes:
- dev
- test
dependsOn:
- default_ns
- create_test_database_server_config
- kind: buildImage2
name: admin_pod_image
dockerFile: /io/repo/admin-pod/Dockerfile
Expand Down Expand Up @@ -316,8 +324,8 @@ steps:
- default_ns
- admin_pod_image
- merge_code
- create_database_server_config
- kind: createDatabase
- create_test_database_server_config
- kind: createDatabase2
name: auth_database
databaseName: auth
image:
Expand Down Expand Up @@ -354,6 +362,8 @@ steps:
- merge_code
- delete_auth_tables
- ci_utils_image
- create_test_database_server_config
- deploy_test_db
- kind: runImage
name: create_deploy_config
image:
Expand Down Expand Up @@ -525,7 +535,7 @@ steps:
- create_deploy_config
- deploy_auth_driver_service_account
- create_test_gsa_keys
- create_database_server_config
- create_test_database_server_config
- kind: runImage
name: create_initial_user
runIfRequested: true
Expand Down Expand Up @@ -1697,8 +1707,8 @@ steps:
- default_ns
- admin_pod_image
- merge_code
- create_database_server_config
- kind: createDatabase
- create_test_database_server_config
- kind: createDatabase2
name: monitoring_database
databaseName: monitoring
image:
Expand All @@ -1721,6 +1731,8 @@ steps:
- merge_code
- delete_monitoring_tables
- ci_utils_image
- create_test_database_server_config
- deploy_test_db
- kind: deploy
name: deploy_monitoring
namespace:
Expand Down Expand Up @@ -1908,7 +1920,7 @@ steps:
- default_ns
- admin_pod_image
- merge_code
- create_database_server_config
- create_test_database_server_config
- kind: runImage
name: delete_ci_tables
image:
Expand All @@ -1931,8 +1943,8 @@ steps:
- default_ns
- admin_pod_image
- merge_code
- create_database_server_config
- kind: createDatabase
- create_test_database_server_config
- kind: createDatabase2
name: ci_database
databaseName: ci
image:
Expand Down Expand Up @@ -1966,7 +1978,9 @@ steps:
- merge_code
- delete_ci_tables
- ci_utils_image
- kind: createDatabase
- create_test_database_server_config
- deploy_test_db
- kind: createDatabase2
name: batch_database
databaseName: batch
image:
Expand Down Expand Up @@ -2200,6 +2214,8 @@ steps:
- merge_code
- delete_batch_tables
- ci_utils_image
- create_test_database_server_config
- deploy_test_db
- kind: deploy
name: deploy_batch
namespace:
Expand Down Expand Up @@ -2663,6 +2679,7 @@ steps:
cp -R /io/repo/ci/* ./ci/
cp /io/repo/tls/Dockerfile ./ci/test/resources/Dockerfile.certs
cp /io/repo/tls/create_certs.py ./ci/test/resources/
cp /io/repo/tls/create_test_db_config.sh ./ci/test/resources/
cp /io/repo/pylintrc ./
cp /io/repo/setup.cfg ./
cp /io/repo/pyproject.toml ./
Expand Down Expand Up @@ -2727,7 +2744,6 @@ steps:
for: alive
dependsOn:
- default_ns
- create_database_server_config
- ci_image
- ci_utils_image
- create_accounts
Expand Down Expand Up @@ -3007,7 +3023,7 @@ steps:
inputs:
- from: /repo/hail/python/hailtop
to: /io/hailtop
- kind: createDatabase
- kind: createDatabase2
name: notebook_database
databaseName: notebook
image:
Expand All @@ -3029,6 +3045,8 @@ steps:
- default_ns
- merge_code
- ci_utils_image
- create_test_database_server_config
- deploy_test_db
- kind: deploy
name: deploy_notebook
namespace:
Expand Down
8 changes: 2 additions & 6 deletions ci/ci/build.py
Original file line number Diff line number Diff line change
Expand Up @@ -1190,14 +1190,10 @@ def __init__(self, params, database_name, namespace, migrations, shutdowns, inpu
self.create_database_job = None
self.cleanup_job = None

self.cant_create_database = is_test_deployment
self.cant_create_database = False

# MySQL user name can be up to 16 characters long before MySQL 5.7.8 (32 after)
if self.cant_create_database:
self._name = None
self.admin_username = None
self.user_username = None
elif params.scope == 'deploy':
if params.scope == 'deploy':
self._name = database_name
self.admin_username = f'{database_name}-admin'
self.user_username = f'{database_name}-user'
Expand Down
132 changes: 51 additions & 81 deletions ci/create_database.py
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,9 @@
from shlex import quote as shq
from typing import Optional

from gear import Database
import orjson

from gear import Database, resolve_test_db_endpoint
from hailtop.auth.sql_config import SQLConfig, create_secret_data_from_config
from hailtop.utils import check_shell, check_shell_output

Expand Down Expand Up @@ -62,19 +64,8 @@ async def create_database():

namespace = create_database_config['namespace']
database_name = create_database_config['database_name']
cant_create_database = create_database_config['cant_create_database']

if cant_create_database:
assert sql_config.db is not None

await write_user_config(namespace, database_name, 'admin', sql_config)
await write_user_config(namespace, database_name, 'user', sql_config)
return

scope = create_database_config['scope']
_name = create_database_config['_name']
admin_username = create_database_config['admin_username']
user_username = create_database_config['user_username']

db = Database()
await db.async_init()
Expand All @@ -89,79 +80,54 @@ async def create_database():
assert len(rows) == 1
return

with open(create_database_config['admin_password_file'], encoding='utf-8') as f:
admin_password = f.read()

with open(create_database_config['user_password_file'], encoding='utf-8') as f:
user_password = f.read()

admin_exists = await db.execute_and_fetchone("SELECT user FROM mysql.user WHERE user=%s", (admin_username,))
admin_exists = admin_exists and admin_exists.get('user') == admin_username

user_exists = await db.execute_and_fetchone("SELECT user FROM mysql.user WHERE user=%s", (user_username,))
user_exists = user_exists and user_exists.get('user') == user_username

create_admin_or_alter_password = (
f"CREATE USER '{admin_username}'@'%' IDENTIFIED BY '{admin_password}';"
if not admin_exists
else "ALTER USER '{admin_username}'@'%' IDENTIFIED BY '{admin_password}';"
)
async def create_user_if_doesnt_exist(admin_or_user, mysql_username, mysql_password):
existing_user = await db.execute_and_fetchone('SELECT 1 FROM mysql.user WHERE user=%s', (mysql_username,))
if existing_user is not None:
return

create_user_or_alter_password = (
f"CREATE USER '{user_username}'@'%' IDENTIFIED BY '{user_password}';"
if not user_exists
else "ALTER USER '{user_username}'@'%' IDENTIFIED BY '{user_password}';"
)
if admin_or_user == 'admin':
allowed_operations = 'ALL'
else:
assert admin_or_user == 'user'
allowed_operations = 'SELECT, INSERT, UPDATE, DELETE, EXECUTE'

await db.just_execute(
f'''
CREATE DATABASE IF NOT EXISTS `{_name}`;
await db.just_execute(
f'''
CREATE USER '{mysql_username}'@'%' IDENTIFIED BY '{mysql_password}';
GRANT {allowed_operations} ON `{_name}`.* TO '{mysql_username}'@'%';
'''
)

{create_admin_or_alter_password}
GRANT ALL ON `{_name}`.* TO '{admin_username}'@'%';
await write_user_config(
namespace,
database_name,
admin_or_user,
SQLConfig(
host=sql_config.host,
port=sql_config.port,
instance=sql_config.instance,
connection_name=sql_config.connection_name,
user=mysql_username,
password=mysql_password,
db=_name,
ssl_ca=sql_config.ssl_ca,
ssl_cert=sql_config.ssl_cert,
ssl_key=sql_config.ssl_key,
ssl_mode=sql_config.ssl_mode,
),
)

{create_user_or_alter_password}
GRANT SELECT, INSERT, UPDATE, DELETE, EXECUTE ON `{_name}`.* TO '{user_username}'@'%';
'''
)
admin_username = create_database_config['admin_username']
user_username = create_database_config['user_username']

await write_user_config(
namespace,
database_name,
'admin',
SQLConfig(
host=sql_config.host,
port=sql_config.port,
instance=sql_config.instance,
connection_name=sql_config.connection_name,
user=admin_username,
password=admin_password,
db=_name,
ssl_ca=sql_config.ssl_ca,
ssl_cert=sql_config.ssl_cert,
ssl_key=sql_config.ssl_key,
ssl_mode=sql_config.ssl_mode,
),
)
with open(create_database_config['admin_password_file'], encoding='utf-8') as f:
admin_password = f.read()
with open(create_database_config['user_password_file'], encoding='utf-8') as f:
user_password = f.read()

await write_user_config(
namespace,
database_name,
'user',
SQLConfig(
host=sql_config.host,
port=sql_config.port,
instance=sql_config.instance,
connection_name=sql_config.connection_name,
user=user_username,
password=user_password,
db=_name,
ssl_ca=sql_config.ssl_ca,
ssl_cert=sql_config.ssl_cert,
ssl_key=sql_config.ssl_key,
ssl_mode=sql_config.ssl_mode,
),
)
await db.just_execute(f'CREATE DATABASE IF NOT EXISTS `{_name}`')
await create_user_if_doesnt_exist('admin', admin_username, admin_password)
await create_user_if_doesnt_exist('user', user_username, user_password)


did_shutdown = False
Expand Down Expand Up @@ -256,11 +222,15 @@ async def async_main():
)
admin_secret = json.loads(out)

admin_sql_config = SQLConfig.from_json(base64.b64decode(admin_secret['data']['sql-config.json']).decode())
if namespace != 'default' and admin_sql_config.host.endswith('.svc.cluster.local'):
admin_sql_config = await resolve_test_db_endpoint(admin_sql_config)

with open('/sql-config.json', 'wb') as f:
f.write(base64.b64decode(admin_secret['data']['sql-config.json']))
f.write(orjson.dumps(admin_sql_config.to_dict()))

with open('/sql-config.cnf', 'wb') as f:
f.write(base64.b64decode(admin_secret['data']['sql-config.cnf']))
f.write(admin_sql_config.to_cnf().encode('utf-8'))

os.environ['HAIL_DATABASE_CONFIG_FILE'] = '/sql-config.json'
os.environ['HAIL_SCOPE'] = scope
Expand Down
8 changes: 8 additions & 0 deletions ci/create_initial_account.py
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,14 @@ async def copy_identity_from_default(hail_credentials_secret_name: str) -> str:

secret = await k8s_client.read_namespaced_secret(hail_credentials_secret_name, 'default')

try:
await k8s_client.delete_namespaced_secret(hail_credentials_secret_name, NAMESPACE)
except kubernetes_asyncio.client.rest.ApiException as e:
if e.status == 404:
pass
else:
raise

await k8s_client.create_namespaced_secret(
NAMESPACE,
kubernetes_asyncio.client.V1Secret(
Expand Down
4 changes: 4 additions & 0 deletions ci/test/resources/build.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -236,6 +236,10 @@ steps:
namespace:
valueFrom: default_ns.name
mountPath: /sql-config
serviceAccount:
name: admin
namespace:
valueFrom: default_ns.name
dependsOn:
- default_ns
- hello_database
Expand Down
Loading

0 comments on commit ae7cdee

Please sign in to comment.