This repository has been archived by the owner on Feb 22, 2023. It is now read-only.
Add documentation describing the data migration process we should follow #1030
Labels
📄 aspect: text
Concerns the textual material in the repository
🌟 goal: addition
Addition of new feature
🟩 priority: low
Low priority and doesn't need to be rushed
Problem
As part of the ECS-ification of the Django service, we are moving towards automated handling of database migrations. Under this new approach, migrations will be applied automatically whenever a new version of the Django application is deployed. To avoid deployments that take hours, we must not rely on Django database migrations for data migrations: that is, we cannot rely on SQL to transform the data in the database. Besides producing migrations that could run for hours (depending on their contents), data migrations of this kind also add database load, which we want to avoid.
Description
In lieu of using Django migrations to transform the data in the database, we will instead follow a data migration strategy that relies on Django management commands to programmatically transform the data. This has several benefits (some repeated from above):
We need to document this process in the Sphinx documentation and spread the word about it to Openverse contributors. If possible, it would also be nice to add a linting check that verifies we are not introducing migrations that include data transformations.
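As a rough illustration of what such a linting check could look like, here is a minimal sketch that parses a migration file's source and reports calls to `RunPython` or `RunSQL`, the two built-in Django migration operations that can carry data transformations. The function name `find_data_operations`, the `FLAGGED_OPERATIONS` set, and the `EXAMPLE` source are hypothetical and not part of the Openverse codebase. Treating every `RunSQL` call as suspicious is an assumption, since `RunSQL` can also be used for schema-only changes.

```python
import ast

# Operations that can carry data transformations. RunSQL may also be used for
# schema-only changes, so a match is a prompt for review, not proof of a problem.
FLAGGED_OPERATIONS = {"RunPython", "RunSQL"}


def find_data_operations(source: str) -> list[tuple[int, str]]:
    """Return (line number, operation name) for each flagged call in `source`."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Match both `migrations.RunPython(...)` and a bare `RunPython(...)`.
        if isinstance(func, ast.Attribute):
            name = func.attr
        else:
            name = getattr(func, "id", None)
        if name in FLAGGED_OPERATIONS:
            findings.append((node.lineno, name))
    return sorted(findings)


# Hypothetical migration source, used only to exercise the check.
EXAMPLE = '''
from django.db import migrations

def forwards(apps, schema_editor):
    pass

class Migration(migrations.Migration):
    operations = [
        migrations.RunPython(forwards),
    ]
'''

for lineno, operation in find_data_operations(EXAMPLE):
    print(f"line {lineno}: {operation} may perform a data migration")
```

Because the check only parses the source text, it does not need Django installed or configured. It could be run over each file in an app's `migrations/` directory from a pre-commit hook or a CI step, with an explicit allowlist for migrations that have been reviewed.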
Alternatives
Additional context
Please refer to https://github.com/WordPress/openverse-infrastructure/issues/176 for the original discussion motivating this change. The repository is private; if you do not have access but would like to see the issue, ping a core contributor, who can share the discussion with you.
Implementation