You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
For the recipe to set NON NULL on an existing column, I think there's a bug with one of the statements that will cause locking for extended periods of time in PG12. I have no tried to reproduce on any newer versions of postgres.
ALTERTABLE"products" ALTER COLUMN "active" TYPE boolean, ALTER COLUMN "active"SETNOT NULL;
Even though the type is the same as the existing column type, which should in itself be a no-op or metadata-only change, I believe it's causing postgres to not think that it can optimize the ALTER; it sees that additional work might be needed to be done at the same time and decides not to take the constraint-check shortcut route, thus triggering the table scan again while locked. At least that was my experience on PG 12.14.
Until ecto is smart enough to know that the column type is the same, and strips out the ALTER COLUMN ... TYPE ..., it might be dangerous to run an ecto-generated ALTER TABLE to perform this step vs. handcrafted SQL. Even then, you'd have to be on a certain version of ecto.
Steps to Reproduce
Assuming a table called "foo" with a lot of data and a column called "a" of type text with a UNIQUE constraint:
(I'm using 50 million rows, but locally on a fast NVMe drive w/ enough RAM to fully cache the data set, so single-digit second responses will be larger on an active production environment)
I'm running PG 12.2 for these timings, but we also experienced the issue on PG 12.14.
Creating and validating the constraint. Note the 3.3 seconds to do a full table scan:
testdb =# ALTER TABLE "foo" ADD CONSTRAINT "a_not_null" CHECK (a IS NOT NULL) NOT VALID;ALTERTABLETime: 6.648 ms
testdb=# ALTER TABLE foo VALIDATE CONSTRAINT a_not_null;ALTERTABLETime: 3365.122 ms (00:03.365)
Converting the column to non-null using SQL produced by mix ecto.migrate --log-migrations-sql. Note the inclusion of ALTER COLUMN "a" TYPE text and the time it takes to run being similar to the full table scan:
testdb =# ALTER TABLE "foo" ALTER COLUMN "a" TYPE text, ALTER COLUMN "a" SET NOT NULL;ALTERTABLETime: 3425.920 ms (00:03.426)
testdb =# ALTER TABLE "foo" ALTER COLUMN "a" DROP NOT NULL;ALTERTABLETime: 6.708 ms
testdb =# ALTER TABLE "foo" ALTER COLUMN "a" TYPE text, ALTER COLUMN "a" SET NOT NULL;ALTERTABLETime: 3370.575 ms (00:03.371)
testdb =# ALTER TABLE "foo" ALTER COLUMN "a" DROP NOT NULL;ALTERTABLETime: 6.634 ms
testdb =# ALTER TABLE "foo" ALTER COLUMN "a" TYPE text, ALTER COLUMN "a" SET NOT NULL;ALTERTABLETime: 3418.631 ms (00:03.419)
testdb =# ALTER TABLE "foo" ALTER COLUMN "a" DROP NOT NULL;ALTERTABLETime: 6.586 ms
Setting the NOT NULL without the type. Note response times in the single-digit milliseconds:
testdb =# ALTER TABLE "foo" ALTER COLUMN "a" SET NOT NULL;ALTERTABLETime: 1.475 ms
testdb =# ALTER TABLE "foo" ALTER COLUMN "a" DROP NOT NULL;ALTERTABLETime: 6.428 ms
testdb =# ALTER TABLE "foo" ALTER COLUMN "a" SET NOT NULL;ALTERTABLETime: 6.562 ms
testdb =# ALTER TABLE "foo" ALTER COLUMN "a" DROP NOT NULL;ALTERTABLETime: 6.685 ms
testdb =# ALTER TABLE "foo" ALTER COLUMN "a" SET NOT NULL;ALTERTABLETime: 6.757 ms
testdb =# ALTER TABLE "foo" ALTER COLUMN "a" DROP NOT NULL;ALTERTABLETime: 6.445 ms
Example Table Setup
The following table setup was used to reproduce the issue:
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATETABLEfoo (
id bigintPRIMARY KEY GENERATED ALWAYS AS IDENTITY,
a text,
b text,
c text,
d text,
e text,
f text,
g text,
h text,
i text,
j text,
k text,
l text,
m text,
n text,
o text,
p text,
q text,
r text,
s text,
t text,
u text,
v text,
w text,
x text,
y text,
z text,
CONSTRAINT a_uniq_idx UNIQUE(a)
);
INSERT INTO foo (a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z)
SELECT
uuid_generate_v4(),
concat('b-', i::text),
concat('c-', i::text),
concat('d-', i::text),
concat('e-', i::text),
concat('f-', i::text),
concat('g-', i::text),
concat('h-', i::text),
concat('i-', i::text),
concat('j-', i::text),
concat('k-', i::text),
concat('l-', i::text),
concat('m-', i::text),
concat('n-', i::text),
concat('o-', i::text),
concat('p-', i::text),
concat('q-', i::text),
concat('r-', i::text),
concat('s-', i::text),
concat('t-', i::text),
concat('u-', i::text),
concat('v-', i::text),
concat('w-', i::text),
concat('x-', i::text),
concat('y-', i::text),
concat('z-', i::text)
FROM generate_series(1, 5000000) AS gs(i);
The text was updated successfully, but these errors were encountered:
Looks like someone added a note about this to the ecto_sql docs for Ecto.Migration.modify/3:
If you want to modify a column without changing its type, such as adding or dropping a null constraints, consider using the execute/2 command with the relevant SQL command instead of modify/3, if supported by your database. This may avoid redundant type updates and be more efficient, as an unnecessary type update can lock the table, even if the type actually doesn't change.
For the recipe to set NON NULL on an existing column, I think there's a bug with one of the statements that will cause locking for extended periods of time in PG12. I have no tried to reproduce on any newer versions of postgres.
When it gets to the following step in the docs:
Ecto generates the following SQL under the hood:
Even though the type is the same as the existing column type, which should in itself be a no-op or metadata-only change, I believe it's causing postgres to not think that it can optimize the ALTER; it sees that additional work might be needed to be done at the same time and decides not to take the constraint-check shortcut route, thus triggering the table scan again while locked. At least that was my experience on PG 12.14.
Until ecto is smart enough to know that the column type is the same, and strips out the
ALTER COLUMN ... TYPE ...
, it might be dangerous to run an ecto-generated ALTER TABLE to perform this step vs. handcrafted SQL. Even then, you'd have to be on a certain version of ecto.Steps to Reproduce
Assuming a table called "foo" with a lot of data and a column called "a" of type
text
with a UNIQUE constraint:(I'm using 50 million rows, but locally on a fast NVMe drive w/ enough RAM to fully cache the data set, so single-digit second responses will be larger on an active production environment)
I'm running PG 12.2 for these timings, but we also experienced the issue on PG 12.14.
Creating and validating the constraint. Note the 3.3 seconds to do a full table scan:
Converting the column to non-null using SQL produced by
mix ecto.migrate --log-migrations-sql
. Note the inclusion ofALTER COLUMN "a" TYPE text
and the time it takes to run being similar to the full table scan:Setting the
NOT NULL
without the type. Note response times in the single-digit milliseconds:Example Table Setup
The following table setup was used to reproduce the issue:
The text was updated successfully, but these errors were encountered: