fix: backend/Makefile local db improvements, migration README (#3029)
* fix: backend/Makefile local db improvements, migration README

- Rename the targets for loading data and loading schema into the local test db
- Adapt the commands for those two targets so they actually work

* doc updates

* additional doc update
Daniel Hegeman authored Aug 29, 2022
1 parent 81094f5 commit aa6d662
Showing 2 changed files with 49 additions and 34 deletions.
16 changes: 9 additions & 7 deletions backend/Makefile
@@ -63,17 +63,19 @@ db/dump:
	PGPASSWORD=${DB_PW} pg_dump -Fc --dbname=corpora_${DEPLOYMENT_STAGE} --file=$(OUTFILE) --host 0.0.0.0 --username corpora_${DEPLOYMENT_STAGE}
	$(MAKE) db/tunnel/down

-db/load/local:
+db/local/load-data:
	# Loads corpora_dev.sqlc into the local Docker env corpora database
-	# Usage: make db/load/local INFILE=<file>
-	docker-compose exec database pg_restore --clean --no-owner --username corpora --dbname corpora $(INFILE)
+	# Usage: make db/local/load-data INFILE=<file>
+	$(eval DB_PW = $(shell aws secretsmanager get-secret-value --secret-id corpora/backend/test/database --region us-west-2 | jq -r '.SecretString | match(":([^:]*)@").captures[0].string'))
+	PGPASSWORD=${DB_PW} pg_restore --clean --no-owner --host 0.0.0.0 --username corpora --dbname corpora $(INFILE)

-db/load/schema:
+db/local/load-schema:
	# Imports the corpora_dev.sqlc schema (schema ONLY) into the corpora_test database
-	# Usage: DEPLOYMENT_STAGE=test make db/import/schema
-	pg_restore --schema-only --clean --no-owner --dbname corpora_test corpora_$(DEPLOYMENT_STAGE).sqlc
+	# Usage: make db/local/load-schema INFILE=<file>
+	$(eval DB_PW = $(shell aws secretsmanager get-secret-value --secret-id corpora/backend/test/database --region us-west-2 | jq -r '.SecretString | match(":([^:]*)@").captures[0].string'))
+	PGPASSWORD=${DB_PW} pg_restore --schema-only --clean --no-owner --host 0.0.0.0 --dbname corpora $(INFILE)
	# Also import alembic schema version
-	pg_restore --data-only --table=alembic_version --no-owner --dbname corpora_test corpora_$(DEPLOYMENT_STAGE).sqlc
+	PGPASSWORD=${DB_PW} pg_restore --data-only --table=alembic_version --no-owner --host 0.0.0.0 --dbname corpora $(INFILE)

db/dump_schema:
ifeq ($(DEPLOYMENT_STAGE),test)
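A minimal usage sketch of the renamed targets, assuming the local Docker environment is already up (`make local-start`) and a dump named `corpora_dev.sqlc` sits in `backend/` (the file name used in the README below). Both targets read the database password from AWS Secrets Manager, so AWS credentials must be configured:

```shell
cd $REPO_ROOT/backend
# Restore schema and data into the local "corpora" database
make db/local/load-data INFILE=corpora_dev.sqlc
# Or restore the schema only, plus the alembic_version table
make db/local/load-schema INFILE=corpora_dev.sqlc
```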
67 changes: 40 additions & 27 deletions backend/database/README.md
@@ -2,8 +2,8 @@

- General information about Alembic migrations can be found [here](https://alembic.sqlalchemy.org/en/latest/index.html).
- For database [make recipes](../Makefile)
- See [Environment variables](../../README.md#environment-variables) for
  usage explanation of `DEPLOYMENT_STAGE`, `AWS_PROFILE` and `CORPORA_LOCAL_DEV`
- `$REPO_ROOT` - root directory where the `single-cell-data-portal` project is cloned (e.g. `~/PyCharmProjects/single-cell-data-portal`)

## How to perform a database migration
@@ -14,13 +14,14 @@ usage explanation of `DEPLOYMENT_STAGE`, `AWS_PROFILE` and `CORPORA_LOCAL_DEV`
3. In the generated file, edit the `upgrade()` and `downgrade()` functions such that `upgrade()` contains the Alembic DDL commands to perform the migration you would like and `downgrade()` contains the commands to undo it.
4. Rename the generated file by prepending the migration count to the filename (`xxx_purpose_of_migration.py` -> `03_xxx_purpose_of_migration.py`)
5. In the generated file, update the `Revision ID` and the `revision` (used by Alembic) to include the migration count.
   For example `Revision ID: a8cd0dc08805` becomes `Revision ID: 18_a8cd0dc08805` and `revision = "a8cd0dc08805"` becomes `revision = "18_a8cd0dc08805"`
6. [Test your migration](#test-a-migration)
7. Check that [corpora_orm.py](../corpora/common/corpora_orm.py) matches up with your changes.
8. Once you've completed the changes, create a PR to get the functions reviewed.
9. Once the PR is merged, migrations will be run as part of the deployment process to each env.
10. [Connect to Remote RDS](#connect-to-remote-rds) to single-cell-dev
11. In a new terminal, complete the migration in the single-cell-dev test env by running:

```shell
cd $REPO_ROOT/backend
DEPLOYMENT_STAGE=test make db/migrate
@@ -30,55 +31,68 @@ DEPLOYMENT_STAGE=test make db/migrate

1. Make changes to the ORM class(es) in [corpora_orm.py](../corpora/common/corpora_orm.py)
2. [Connect to Remote RDS](#connect-to-remote-rds). Note, generally, you would be connecting to prod
   (`AWS_PROFILE=single-cell-prod DEPLOYMENT_STAGE=prod`) since we want to generate
   a migration from the database schema currently deployed in prod. However, if there are migrations that haven't been
   deployed to prod yet, you would connect to staging here.
3. Autogenerate the migration using the steps below. `AWS_PROFILE` and `DEPLOYMENT_STAGE` should be the same values
   used in the previous [Connect to Remote RDS](#connect-to-remote-rds) step. For details about Alembic's migration autogeneration,
   see [What does Autogenerate Detect (and what does it not detect?)](https://alembic.sqlalchemy.org/en/latest/autogenerate.html#what-does-autogenerate-detect-and-what-does-it-not-detect)

```shell
cd $REPO_ROOT/backend
AWS_PROFILE=single-cell-{dev,prod} DEPLOYMENT_STAGE={dev,staging,prod} CORPORA_LOCAL_DEV=1 make db/new_migration_auto MESSAGE="purpose_of_migration"
```

4. Follow [How to perform a database migration](#how-to-perform-a-database-migration) starting from **step 3**
   (i.e. editing the `upgrade()` and `downgrade()` functions).
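As a concrete illustration of steps 4 and 5 of that procedure, the rename might look like the following sketch; the `versions/` directory (assumed to live next to this README), the revision hash, and the migration count are hypothetical:

```shell
cd $REPO_ROOT/backend/database
# Suppose Alembic generated versions/a8cd0dc08805_purpose_of_migration.py and the
# newest existing migration is number 17; prepend the next count to the file name:
mv versions/a8cd0dc08805_purpose_of_migration.py versions/18_a8cd0dc08805_purpose_of_migration.py
# Then edit the file so that revision = "18_a8cd0dc08805" and the Revision ID comment matches.
```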

### Test a Migration

The following steps will test that a migration script works on a local database using data downloaded from a deployed database.

1. [Connect to Remote RDS](#connect-to-remote-rds)
-2. Open a new terminal and using the same values for `AWS_PROFILE` and `DEPLOYMENT_STAGE`, download the remote database schema:
+2. Open a new terminal and, using the same values for `AWS_PROFILE` and `DEPLOYMENT_STAGE`, download the remote dev database schema:

```shell
cd $REPO_ROOT/backend
-AWS_PROFILE=single-cell-{dev,prod} DEPLOYMENT_STAGE={dev,staging,prod} make db/download
+AWS_PROFILE=single-cell-{dev,prod} DEPLOYMENT_STAGE={dev,staging,prod} make db/dump OUTFILE=corpora_dev.sqlc
```
-This will download the database into the `$REPO_ROOT/backend/database` directory in a file named `corpora_${DEPLOYMENT_STAGE}-<YYYYmmddHHMM>.sqlc`.
+This will download the database to `$REPO_ROOT/backend/corpora_dev.sqlc`.

-3. Close the tunnel to the remote database
+3. The tunnel to dev should close automatically (but it is worth verifying with `ps ax | grep ssh`).
4. Start the local database environment:

```shell
cd $REPO_ROOT
make local-start
```

5. Import the remote database schema into your local database:

```shell
cd $REPO_ROOT/backend
-DEPLOYMENT_STAGE=test make db/import FROM=corpora_{dev,staging,prod}-<YYYYmmddHHMM> # exclude the .sqlc extension
+make db/local/load-schema INFILE=corpora_dev.sqlc
```
-where the `FROM` parameter is the base name of the `.sqlc` file downloaded from the `make db/download` step above. For example
+where the `INFILE` parameter is the name of the `.sqlc` file downloaded in the `make db/dump` step above. For example:

```shell
-make db/import FROM=corpora_prod-202102221309
+make db/local/load-schema INFILE=corpora_dev.sqlc
```
-- Note: The file is stored locally under `$REPO_ROOT/backend/database/corpora_${DEPLOYMENT_STAGE}-<YYYYmmddHHMM>.sqlc`
-but the `make db/import` command retrieves it from `/import/$(FROM).sqlc` due to the way [the local paths are mapped to the Docker container](https://github.com/chanzuckerberg/single-cell-data-portal/blob/ffca067b9e4aea237fa2bd7c7a9cbc5813ebd449/docker-compose.yml#L13)

You may need to run this a few times, until there are no significant errors. A quick way to verify the result is sketched at the end of this section.
- Note: `pg_restore: error: could not execute query: ERROR: role "rdsadmin" does not exist` is not a significant error

6. Run the migration test:

```shell
AWS_PROFILE=single-cell-{dev,prod} DEPLOYMENT_STAGE=test make db/test_migration
```

This test will:

1. Dump the current schema (before)
1. Apply the migration (upgrade)
1. Rollback the migration (downgrade)
@@ -88,19 +102,18 @@
If there are no differences then the test passed. If the test didn't pass, make adjustments to your migration script and restart from step 5. Repeat until there are no errors.
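As referenced in step 5, a quick way to verify what was loaded is to query the local database directly. This is a sketch only, assuming the docker-compose service is named `database` and uses the `corpora` user and database, as in the Makefile:

```shell
cd $REPO_ROOT
# List the restored tables and the current Alembic revision in the local database
docker-compose exec database psql --username corpora --dbname corpora -c '\dt'
docker-compose exec database psql --username corpora --dbname corpora -c 'SELECT version_num FROM alembic_version;'
```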

## Connect to Remote RDS

Enable local connection to the private RDS instance:

- Note: Since the default PostgreSQL port is `5432`, the tunnel command below will conflict with a local PostgreSQL instance.
  To stop the local instance, run `make local-stop` from the `$REPO_ROOT` directory.

```shell
cd $REPO_ROOT/backend
AWS_PROFILE=single-cell-{dev,prod} DEPLOYMENT_STAGE={dev,staging,prod} make db/tunnel
```

This command opens an SSH tunnel from `localhost:5432` to the RDS connection endpoint via the _bastion_ server.
The local port `5432` is fixed and encoded in the DB connection string stored in
[AWS Secrets Manager](https://us-west-2.console.aws.amazon.com/secretsmanager/home?region=us-west-2#!/listSecrets/)
in the secret named `corpora/backend/${DEPLOYMENT_STAGE}/database_local`.
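With the tunnel open, any PostgreSQL client on the host can reach the remote database through `localhost:5432`. The sketch below is illustrative only; it assumes the `corpora_${DEPLOYMENT_STAGE}` user and database names used by the Makefile's `db/dump` target, with the password taken from the connection string in the secret above:

```shell
# Hypothetical direct query through the open tunnel
PGPASSWORD=<password from the secret> psql --host 0.0.0.0 --port 5432 \
  --username corpora_${DEPLOYMENT_STAGE} --dbname corpora_${DEPLOYMENT_STAGE} \
  -c 'SELECT version_num FROM alembic_version;'
```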
