-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add db indexes to SiteDailyMetrics and CourseDailyMetrics #297
Conversation
This commit adds an index to the SiteDailyMetrics field: 'date_for' This commit adds indexes to the CourseDailyMetrics fields: 'course_id' and 'date_for' This commit adds a new migration, '0014' to perform non-atomic separate state and database migration actions. Included is compatibility functionality so that the migration works for both MySQL and SQLite.
f6c6212
to
a665aa8
Compare
], | ||
database_operations=[ | ||
migrations.RunSQL(sql=""" | ||
CREATE INDEX figures_coursedailymetrics_course_id_f7047b32 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know we've done the same before, but could you please share why are we making custom SQL and how would the Django default migration be atomic/blocking/slow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, the default migration does this:
$ ./manage.py sqlmigrate figures 0014
BEGIN;
--
-- Alter field course_id on coursedailymetrics
--
ALTER TABLE "figures_coursedailymetrics" RENAME TO "figures_coursedailymetrics__old";
CREATE TABLE "figures_coursedailymetrics" ("id" integer NOT NULL PRIMARY KEY AUTOINCREMENT, "course_id" varchar(255) NOT NULL, "created" datetime NOT NULL, "modified" datetime NOT NULL, "date_for" date NOT NULL, "enrollment_count" integer NOT NULL, "active_learners_today" integer NOT NULL, "average_progress" decimal NULL, "average_days_to_complete" integer NULL, "num_learners_completed" integer NOT NULL, "site_id" integer NOT NULL REFERENCES "django_site" ("id"));
INSERT INTO "figures_coursedailymetrics" ("created", "average_progress", "site_id", "modified", "num_learners_completed", "enrollment_count", "average_days_to_complete", "course_id", "date_for", "active_learners_today", "id") SELECT "created", "average_progress", "site_id", "modified", "num_learners_completed", "enrollment_count", "average_days_to_complete", "course_id", "date_for", "active_learners_today", "id" FROM "figures_coursedailymetrics__old";
DROP TABLE "figures_coursedailymetrics__old";
CREATE UNIQUE INDEX "figures_coursedailymetrics_course_id_date_for_3674261e_uniq" ON "figures_coursedailymetrics" ("course_id", "date_for");
CREATE INDEX "figures_coursedailymetrics_course_id_f7047b32" ON "figures_coursedailymetrics" ("course_id");
CREATE INDEX "figures_coursedailymetrics_site_id_2d1c0ace" ON "figures_coursedailymetrics" ("site_id");
--
-- Alter field date_for on coursedailymetrics
--
ALTER TABLE "figures_coursedailymetrics" RENAME TO "figures_coursedailymetrics__old";
CREATE TABLE "figures_coursedailymetrics" ("id" integer NOT NULL PRIMARY KEY AUTOINCREMENT, "created" datetime NOT NULL, "modified" datetime NOT NULL, "course_id" varchar(255) NOT NULL, "enrollment_count" integer NOT NULL, "active_learners_today" integer NOT NULL, "average_progress" decimal NULL, "average_days_to_complete" integer NULL, "num_learners_completed" integer NOT NULL, "site_id" integer NOT NULL REFERENCES "django_site" ("id"), "date_for" date NOT NULL);
INSERT INTO "figures_coursedailymetrics" ("created", "average_progress", "site_id", "modified", "num_learners_completed", "enrollment_count", "average_days_to_complete", "course_id", "date_for", "active_learners_today", "id") SELECT "created", "average_progress", "site_id", "modified", "num_learners_completed", "enrollment_count", "average_days_to_complete", "course_id", "date_for", "active_learners_today", "id" FROM "figures_coursedailymetrics__old";
DROP TABLE "figures_coursedailymetrics__old";
CREATE UNIQUE INDEX "figures_coursedailymetrics_course_id_date_for_3674261e_uniq" ON "figures_coursedailymetrics" ("course_id", "date_for");
CREATE INDEX "figures_coursedailymetrics_course_id_f7047b32" ON "figures_coursedailymetrics" ("course_id");
CREATE INDEX "figures_coursedailymetrics_site_id_2d1c0ace" ON "figures_coursedailymetrics" ("site_id");
CREATE INDEX "figures_coursedailymetrics_date_for_481b5758" ON "figures_coursedailymetrics" ("date_for");
--
-- Alter field date_for on sitedailymetrics
--
ALTER TABLE "figures_sitedailymetrics" RENAME TO "figures_sitedailymetrics__old";
CREATE TABLE "figures_sitedailymetrics" ("id" integer NOT NULL PRIMARY KEY AUTOINCREMENT, "created" datetime NOT NULL, "modified" datetime NOT NULL, "cumulative_active_user_count" integer NULL, "todays_active_user_count" integer NULL, "total_user_count" integer NOT NULL, "course_count" integer NOT NULL, "total_enrollment_count" integer NOT NULL, "site_id" integer NOT NULL REFERENCES "django_site" ("id"), "mau" integer NULL, "date_for" date NOT NULL);
INSERT INTO "figures_sitedailymetrics" ("cumulative_active_user_count", "total_user_count", "todays_active_user_count", "mau", "created", "site_id", "modified", "course_count", "date_for", "id", "total_enrollment_count") SELECT "cumulative_active_user_count", "total_user_count", "todays_active_user_count", "mau", "created", "site_id", "modified", "course_count", "date_for", "id", "total_enrollment_count" FROM "figures_sitedailymetrics__old";
DROP TABLE "figures_sitedailymetrics__old";
CREATE INDEX "figures_sitedailymetrics_site_id_c5276d67" ON "figures_sitedailymetrics" ("site_id");
CREATE INDEX "figures_sitedailymetrics_date_for_4d95be72" ON "figures_sitedailymetrics" ("date_for");
COMMIT;
and the custom migration does this:
$ ./manage.py sqlmigrate figures 0014
--
-- Custom state/database change combination
--
CREATE INDEX figures_coursedailymetrics_course_id_f7047b32
ON figures_coursedailymetrics (course_id);
CREATE INDEX figures_coursedailymetrics_date_for_481b5758
ON figures_coursedailymetrics (date_for);
CREATE INDEX figures_sitedailymetrics_date_for_4d95be72
ON figures_sitedailymetrics (date_for);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since, AFAIK from the official docs, community conversation, MySQL 5.6+ does non-locking index creation by default, meaning zero downtime indexes. The default Django way needs to drop tables and recreate them, which, AFAIK, means the tables need to be locked in order to copy without losing data.
Although this might not strictly be needed for Figures TODAY as it is mostly a read only system except for pipeline jobs, we're practicing doing it the zero downtime way so we've got it in our toolbox because next week, we'll need to not lock Figures tables because we're introducing signals to trigger updates to Figures models
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wow! that's a ton more changes that just create index
. thanks john!
This commit adds an index to the SiteDailyMetrics field: 'date_for'
This commit adds indexes to the CourseDailyMetrics fields: 'course_id' and 'date_for'
This commit adds a new migration, '0014' to perform non-atomic separate state and database migration actions. Included is compatibility functionality so that the migration works for both MySQL and SQLite.
Effectively this PR is a "copy/paste" and update based on my last migration PR: #288