Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add db indexes to SiteDailyMetrics and CourseDailyMetrics #297

Merged
merged 1 commit into from
Dec 8, 2020

Conversation

johnbaldwin
Copy link
Contributor

@johnbaldwin johnbaldwin commented Dec 8, 2020

This commit adds an index to the SiteDailyMetrics field: 'date_for'

This commit adds indexes to the CourseDailyMetrics fields: 'course_id' and 'date_for'

This commit adds a new migration, '0014' to perform non-atomic separate state and database migration actions. Included is compatibility functionality so that the migration works for both MySQL and SQLite.

Effectively this PR is a "copy/paste" and update based on my last migration PR: #288

This commit adds an index to the SiteDailyMetrics field: 'date_for'

This commit adds indexes to the CourseDailyMetrics fields: 'course_id'
and 'date_for'

This commit adds a new migration, '0014' to perform non-atomic separate
state and database migration actions. Included is compatibility
functionality so that the migration works for both MySQL and SQLite.
@johnbaldwin johnbaldwin force-pushed the john/add-indexes-to-daily-metrics branch from f6c6212 to a665aa8 Compare December 8, 2020 10:02
],
database_operations=[
migrations.RunSQL(sql="""
CREATE INDEX figures_coursedailymetrics_course_id_f7047b32
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know we've done the same before, but could you please share why are we making custom SQL and how would the Django default migration be atomic/blocking/slow?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, the default migration does this:

$ ./manage.py sqlmigrate figures 0014
BEGIN;
--
-- Alter field course_id on coursedailymetrics
--
ALTER TABLE "figures_coursedailymetrics" RENAME TO "figures_coursedailymetrics__old";
CREATE TABLE "figures_coursedailymetrics" ("id" integer NOT NULL PRIMARY KEY AUTOINCREMENT, "course_id" varchar(255) NOT NULL, "created" datetime NOT NULL, "modified" datetime NOT NULL, "date_for" date NOT NULL, "enrollment_count" integer NOT NULL, "active_learners_today" integer NOT NULL, "average_progress" decimal NULL, "average_days_to_complete" integer NULL, "num_learners_completed" integer NOT NULL, "site_id" integer NOT NULL REFERENCES "django_site" ("id"));
INSERT INTO "figures_coursedailymetrics" ("created", "average_progress", "site_id", "modified", "num_learners_completed", "enrollment_count", "average_days_to_complete", "course_id", "date_for", "active_learners_today", "id") SELECT "created", "average_progress", "site_id", "modified", "num_learners_completed", "enrollment_count", "average_days_to_complete", "course_id", "date_for", "active_learners_today", "id" FROM "figures_coursedailymetrics__old";
DROP TABLE "figures_coursedailymetrics__old";
CREATE UNIQUE INDEX "figures_coursedailymetrics_course_id_date_for_3674261e_uniq" ON "figures_coursedailymetrics" ("course_id", "date_for");
CREATE INDEX "figures_coursedailymetrics_course_id_f7047b32" ON "figures_coursedailymetrics" ("course_id");
CREATE INDEX "figures_coursedailymetrics_site_id_2d1c0ace" ON "figures_coursedailymetrics" ("site_id");
--
-- Alter field date_for on coursedailymetrics
--
ALTER TABLE "figures_coursedailymetrics" RENAME TO "figures_coursedailymetrics__old";
CREATE TABLE "figures_coursedailymetrics" ("id" integer NOT NULL PRIMARY KEY AUTOINCREMENT, "created" datetime NOT NULL, "modified" datetime NOT NULL, "course_id" varchar(255) NOT NULL, "enrollment_count" integer NOT NULL, "active_learners_today" integer NOT NULL, "average_progress" decimal NULL, "average_days_to_complete" integer NULL, "num_learners_completed" integer NOT NULL, "site_id" integer NOT NULL REFERENCES "django_site" ("id"), "date_for" date NOT NULL);
INSERT INTO "figures_coursedailymetrics" ("created", "average_progress", "site_id", "modified", "num_learners_completed", "enrollment_count", "average_days_to_complete", "course_id", "date_for", "active_learners_today", "id") SELECT "created", "average_progress", "site_id", "modified", "num_learners_completed", "enrollment_count", "average_days_to_complete", "course_id", "date_for", "active_learners_today", "id" FROM "figures_coursedailymetrics__old";
DROP TABLE "figures_coursedailymetrics__old";
CREATE UNIQUE INDEX "figures_coursedailymetrics_course_id_date_for_3674261e_uniq" ON "figures_coursedailymetrics" ("course_id", "date_for");
CREATE INDEX "figures_coursedailymetrics_course_id_f7047b32" ON "figures_coursedailymetrics" ("course_id");
CREATE INDEX "figures_coursedailymetrics_site_id_2d1c0ace" ON "figures_coursedailymetrics" ("site_id");
CREATE INDEX "figures_coursedailymetrics_date_for_481b5758" ON "figures_coursedailymetrics" ("date_for");
--
-- Alter field date_for on sitedailymetrics
--
ALTER TABLE "figures_sitedailymetrics" RENAME TO "figures_sitedailymetrics__old";
CREATE TABLE "figures_sitedailymetrics" ("id" integer NOT NULL PRIMARY KEY AUTOINCREMENT, "created" datetime NOT NULL, "modified" datetime NOT NULL, "cumulative_active_user_count" integer NULL, "todays_active_user_count" integer NULL, "total_user_count" integer NOT NULL, "course_count" integer NOT NULL, "total_enrollment_count" integer NOT NULL, "site_id" integer NOT NULL REFERENCES "django_site" ("id"), "mau" integer NULL, "date_for" date NOT NULL);
INSERT INTO "figures_sitedailymetrics" ("cumulative_active_user_count", "total_user_count", "todays_active_user_count", "mau", "created", "site_id", "modified", "course_count", "date_for", "id", "total_enrollment_count") SELECT "cumulative_active_user_count", "total_user_count", "todays_active_user_count", "mau", "created", "site_id", "modified", "course_count", "date_for", "id", "total_enrollment_count" FROM "figures_sitedailymetrics__old";
DROP TABLE "figures_sitedailymetrics__old";
CREATE INDEX "figures_sitedailymetrics_site_id_c5276d67" ON "figures_sitedailymetrics" ("site_id");
CREATE INDEX "figures_sitedailymetrics_date_for_4d95be72" ON "figures_sitedailymetrics" ("date_for");
COMMIT;

and the custom migration does this:

$ ./manage.py sqlmigrate figures 0014
--
-- Custom state/database change combination
--
CREATE INDEX figures_coursedailymetrics_course_id_f7047b32
                    ON figures_coursedailymetrics (course_id);
CREATE INDEX figures_coursedailymetrics_date_for_481b5758
                    ON figures_coursedailymetrics (date_for);
CREATE INDEX figures_sitedailymetrics_date_for_4d95be72
                    ON figures_sitedailymetrics (date_for);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since, AFAIK from the official docs, community conversation, MySQL 5.6+ does non-locking index creation by default, meaning zero downtime indexes. The default Django way needs to drop tables and recreate them, which, AFAIK, means the tables need to be locked in order to copy without losing data.

Although this might not strictly be needed for Figures TODAY as it is mostly a read only system except for pipeline jobs, we're practicing doing it the zero downtime way so we've got it in our toolbox because next week, we'll need to not lock Figures tables because we're introducing signals to trigger updates to Figures models

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wow! that's a ton more changes that just create index. thanks john!

@johnbaldwin johnbaldwin merged commit f1c37cc into master Dec 8, 2020
@johnbaldwin johnbaldwin deleted the john/add-indexes-to-daily-metrics branch December 8, 2020 11:15
johnbaldwin added a commit that referenced this pull request Dec 8, 2020
Figures Release 0.4.dev2

* 0.4.dev2 version bump and settings improvement

    * #296

* Add db indexes to SiteDailyMetrics and CourseDailyMetrics

    * #297
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants