Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BW-1398] Migrate PKs to BIGINT #6907

Merged
merged 4 commits into from
Sep 16, 2022
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 10 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,15 @@
# Cromwell Change Log

## 85 Release Notes

### Migration of PKs to BIGINT

The PK of below tables will be migrated from INT to BIGINT. Also, since `ROOT_WORKFLOW_ID` in `SUB_WORKFLOW_STORE_ENTRY` is a FK to `WORKFLOW_STORE_ENTRY_ID` in `WORKFLOW_STORE_ENTRY`
it is also being migrated from INT to BIGINT.
* DOCKER_HASH_STORE_ENTRY
* WORKFLOW_STORE_ENTRY
* SUB_WORKFLOW_STORE_ENTRY

## 84 Release Notes

### CromIAM enabled user checks
Expand Down
3 changes: 3 additions & 0 deletions database/migration/src/main/resources/changelog.xml
Original file line number Diff line number Diff line change
Expand Up @@ -87,6 +87,9 @@
<include file="changesets/enlarge_call_caching_detritus_entry_id.xml" relativeToChangelogFile="true" />
<include file="changesets/enlarge_call_caching_simpleton_entry_id.xml" relativeToChangelogFile="true" />
<include file="changesets/reset_call_caching_hash_entry_id_autoincrement.xml" relativeToChangelogFile="true" />
<include file="changesets/enlarge_docker_hash_store_entry_id.xml" relativeToChangelogFile="true" />
<include file="changesets/enlarge_workflow_store_entry_id.xml" relativeToChangelogFile="true" />
<include file="changesets/enlarge_sub_workflow_store_entry_id.xml" relativeToChangelogFile="true" />
<!-- REMINDER!
Before appending here, did you remember to include the 'objectQuotingStrategy="QUOTE_ALL_OBJECTS"' line in your changeset/xyz.xml...?
-->
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<databaseChangeLog objectQuotingStrategy="QUOTE_ALL_OBJECTS"
xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.3.xsd">

<!-- BEGIN DOCKER_HASH_STORE_ENTRY_ID PK widening -->
<!-- For Postgresql there are 2 changesets: one for modifying the table column type, and another one for altering the autoincrementing sequence.
The other DBs can be refactored similarly with a single addAutoIncrement changeset. -->
<changeSet author="sshah" id="enlarge_docker_hash_store_entry_id" dbms="mysql,hsqldb,mariadb">
<addAutoIncrement
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume you've tested that this won't reset the autoincrement to 0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I am not sure. I copied what we had done for enlarge_job_store_ids.xml. I had run this changeset locally but it seems I don't have anything in DOCKER_HASH table or WORKFLOW_STORE table. I will run workflows that adds to this table and run the changeset again and see what happens

Copy link
Contributor Author

@salonishah11 salonishah11 Sep 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I verified in my local MySQL db that after the migration, if the PK in WORKFLOW_STORE_ENTRY was at 19 (all in Submitted state), it added new workflows at 20 onwards

columnName="DOCKER_HASH_STORE_ENTRY_ID"
columnDataType="BIGINT"
incrementBy="1"
tableName="DOCKER_HASH_STORE_ENTRY"
/>
</changeSet>

<changeSet author="sshah" id="postgresql_enlarge_docker_hash_store_entry_id" dbms="postgresql">
<modifyDataType
columnName="DOCKER_HASH_STORE_ENTRY_ID"
tableName="DOCKER_HASH_STORE_ENTRY"
newDataType="BIGINT"
/>
</changeSet>

<changeSet author="sshah" id="postgresql_enlarge_docker_hash_store_entry_id_seq" dbms="postgresql">
<preConditions onFail="MARK_RAN">
<!-- idempotency check (noop if the sequence is present and already consistent what the alter would do) -->
<sqlCheck expectedResult="0">
SELECT count(*)
FROM information_schema.sequences
WHERE sequence_name = 'DOCKER_HASH_STORE_ENTRY_DOCKER_HASH_STORE_ENTRY_ID_seq'
AND data_type = 'bigint';
</sqlCheck>
</preConditions>
<sql>alter sequence "DOCKER_HASH_STORE_ENTRY_DOCKER_HASH_STORE_ENTRY_ID_seq" as bigint;</sql>
</changeSet>
<!-- END DOCKER_HASH_STORE_ENTRY_ID PK widening -->

</databaseChangeLog>
salonishah11 marked this conversation as resolved.
Show resolved Hide resolved
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<databaseChangeLog objectQuotingStrategy="QUOTE_ALL_OBJECTS"
xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.3.xsd">

<!-- BEGIN SUB_WORKFLOW_STORE_ENTRY_ID PK widening -->
<!-- For Postgresql there are 2 changesets: one for modifying the table column type, and another one for altering the autoincrementing sequence.
The other DBs can be refactored similarly with a single addAutoIncrement changeset. -->
<changeSet author="sshah" id="enlarge_sub_workflow_store_entry_id" dbms="mysql,hsqldb,mariadb">
<addAutoIncrement
columnName="SUB_WORKFLOW_STORE_ENTRY_ID"
columnDataType="BIGINT"
incrementBy="1"
tableName="SUB_WORKFLOW_STORE_ENTRY"
/>
</changeSet>

<changeSet author="sshah" id="postgresql_enlarge_sub_workflow_store_entry_id" dbms="postgresql">
<modifyDataType
columnName="SUB_WORKFLOW_STORE_ENTRY_ID"
tableName="SUB_WORKFLOW_STORE_ENTRY"
newDataType="BIGINT"
/>
</changeSet>

<changeSet author="sshah" id="postgresql_enlarge_sub_workflow_store_entry_id_seq" dbms="postgresql">
<preConditions onFail="MARK_RAN">
<!-- idempotency check (noop if the sequence is present and already consistent what the alter would do) -->
<sqlCheck expectedResult="0">
SELECT count(*)
FROM information_schema.sequences
WHERE sequence_name = 'SUB_WORKFLOW_STORE_ENTRY_SUB_WORKFLOW_STORE_ENTRY_ID_seq'
AND data_type = 'bigint';
</sqlCheck>
</preConditions>
<sql>alter sequence "SUB_WORKFLOW_STORE_ENTRY_SUB_WORKFLOW_STORE_ENTRY_ID_seq" as bigint;</sql>
</changeSet>
<!-- END SUB_WORKFLOW_STORE_ENTRY_ID PK widening -->

</databaseChangeLog>
Original file line number Diff line number Diff line change
@@ -0,0 +1,82 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<databaseChangeLog objectQuotingStrategy="QUOTE_ALL_OBJECTS"
xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.3.xsd">

<!-- Drop the foreign key constraint from SUB_WORKFLOW_STORE_ENTRY to WORKFLOW_STORE_ENTRY to allow for the latter's PK to be widened. -->
<changeSet author="sshah" id="drop_sub_workflow_store_entry_root_workflow_id_fk" dbms="mysql,hsqldb,postgresql,mariadb">
<dropForeignKeyConstraint
baseTableName="SUB_WORKFLOW_STORE_ENTRY"
constraintName="FK_SUB_WORKFLOW_STORE_ENTRY_ROOT_WORKFLOW_ID"
/>
</changeSet>

<!-- BEGIN WORKFLOW_STORE_ENTRY_ID PK widening -->
<!-- For Postgresql there are 2 changesets: one for modifying the table column type, and another one for altering the autoincrementing sequence.
The other DBs can be refactored similarly with a single addAutoIncrement changeset. -->
<changeSet author="sshah" id="enlarge_workflow_store_entry_id" dbms="mysql,hsqldb,mariadb">
<addAutoIncrement
columnName="WORKFLOW_STORE_ENTRY_ID"
columnDataType="BIGINT"
incrementBy="1"
tableName="WORKFLOW_STORE_ENTRY"
/>
</changeSet>

<changeSet author="sshah" id="postgresql_enlarge_workflow_store_entry_id" dbms="postgresql">
<modifyDataType
columnName="WORKFLOW_STORE_ENTRY_ID"
tableName="WORKFLOW_STORE_ENTRY"
newDataType="BIGINT"
/>
</changeSet>

<changeSet author="sshah" id="postgresql_enlarge_workflow_store_entry_id_seq" dbms="postgresql">
<preConditions onFail="MARK_RAN">
<!-- idempotency check (noop if the sequence is present and already consistent what the alter would do) -->
<sqlCheck expectedResult="0">
SELECT count(*)
FROM information_schema.sequences
WHERE sequence_name = 'WORKFLOW_STORE_ENTRY_WORKFLOW_STORE_ENTRY_ID_seq'
AND data_type = 'bigint';
</sqlCheck>
</preConditions>
<sql>alter sequence "WORKFLOW_STORE_ENTRY_WORKFLOW_STORE_ENTRY_ID_seq" as bigint;</sql>
</changeSet>
<!-- END WORKFLOW_STORE_ENTRY_ID PK widening -->

<!-- BEGIN widening FK to match PK -->
<changeSet author="sshah" id="enlarge_sub_workflow_store_entry_fk" dbms="mysql,hsqldb,postgresql,mariadb">
<modifyDataType
tableName="SUB_WORKFLOW_STORE_ENTRY"
columnName="ROOT_WORKFLOW_ID"
newDataType="BIGINT"
/>
</changeSet>
<!-- END widening FK to match PK -->

<!-- MariaDB's FK NotNull constraint does not survive the widening above and must be recreated explicitly. -->
<!-- BEGIN Restoring FK NotNull constraint -->
<changeSet author="sshah" id="mariadb_not_null_constraint_sub_workflow_store_entry_fk" dbms="mariadb,mysql">
<addNotNullConstraint
tableName="SUB_WORKFLOW_STORE_ENTRY"
columnName="ROOT_WORKFLOW_ID"
columnDataType="BIGINT"
/>
</changeSet>
<!-- END Restoring FK NotNull constraint -->

<!-- Restoring the FK -->
<changeSet author="sshah" id="recreate_sub_workflow_store_entry_root_workflow_id_fk" dbms="mysql,hsqldb,postgresql,mariadb">
<addForeignKeyConstraint
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the "takes very little time" assumption still hold if we are re-adding indexes as part of this changeset?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if this was measured on prod clone

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it would be better to measure how much time it takes to add the FK constraint back in prod clone before merging this PR?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably a good idea to double check, just in case it goes from 20ms to 20 hours

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The time to re-add this FK constraint on prod clone is 3s 94ms using the below query

ALTER TABLE SUB_WORKFLOW_STORE_ENTRY
ADD CONSTRAINT FK_SUB_WORKFLOW_STORE_ENTRY_ROOT_WORKFLOW_ID
FOREIGN KEY (`ROOT_WORKFLOW_ID`) REFERENCES `WORKFLOW_STORE_ENTRY` (`WORKFLOW_STORE_ENTRY_ID`) ON DELETE CASCADE;

constraintName="FK_SUB_WORKFLOW_STORE_ENTRY_ROOT_WORKFLOW_ID"
baseColumnNames="ROOT_WORKFLOW_ID"
baseTableName="SUB_WORKFLOW_STORE_ENTRY"
referencedTableName="WORKFLOW_STORE_ENTRY"
referencedColumnNames="WORKFLOW_STORE_ENTRY_ID"
onDelete="CASCADE"
/>
</changeSet>

</databaseChangeLog>
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@ trait DockerHashStoreEntryComponent {
import driver.api._

class DockerHashStoreEntries(tag: Tag) extends Table[DockerHashStoreEntry](tag, "DOCKER_HASH_STORE_ENTRY") {
def dockerHashStoreEntryId = column[Int]("DOCKER_HASH_STORE_ENTRY_ID", O.PrimaryKey, O.AutoInc)
def dockerHashStoreEntryId = column[Long]("DOCKER_HASH_STORE_ENTRY_ID", O.PrimaryKey, O.AutoInc)

def workflowExecutionUuid = column[String]("WORKFLOW_EXECUTION_UUID", O.Length(255))

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -10,9 +10,9 @@ trait SubWorkflowStoreEntryComponent {
import driver.api._

class SubWorkflowStoreEntries(tag: Tag) extends Table[SubWorkflowStoreEntry](tag, "SUB_WORKFLOW_STORE_ENTRY") {
def subWorkflowStoreEntryId = column[Int]("SUB_WORKFLOW_STORE_ENTRY_ID", O.PrimaryKey, O.AutoInc)
def subWorkflowStoreEntryId = column[Long]("SUB_WORKFLOW_STORE_ENTRY_ID", O.PrimaryKey, O.AutoInc)

def rootWorkflowId = column[Int]("ROOT_WORKFLOW_ID")
def rootWorkflowId = column[Long]("ROOT_WORKFLOW_ID")

def parentWorkflowExecutionUuid = column[String]("PARENT_WORKFLOW_EXECUTION_UUID", O.Length(255))

Expand Down Expand Up @@ -40,7 +40,7 @@ trait SubWorkflowStoreEntryComponent {
val subWorkflowStoreEntryIdsAutoInc = subWorkflowStoreEntries returning subWorkflowStoreEntries.map(_.subWorkflowStoreEntryId)

val subWorkflowStoreEntriesForRootWorkflowId = Compiled(
(rootWorkflowId: Rep[Int]) => for {
(rootWorkflowId: Rep[Long]) => for {
subWorkflowStoreEntry <- subWorkflowStoreEntries
if subWorkflowStoreEntry.rootWorkflowId === rootWorkflowId
} yield subWorkflowStoreEntry
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,7 @@ trait WorkflowStoreEntryComponent {
import driver.api._

class WorkflowStoreEntries(tag: Tag) extends Table[WorkflowStoreEntry](tag, "WORKFLOW_STORE_ENTRY") {
def workflowStoreEntryId = column[Int]("WORKFLOW_STORE_ENTRY_ID", O.PrimaryKey, O.AutoInc)
def workflowStoreEntryId = column[Long]("WORKFLOW_STORE_ENTRY_ID", O.PrimaryKey, O.AutoInc)

def workflowExecutionUuid = column[String]("WORKFLOW_EXECUTION_UUID", O.Length(255))

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -6,5 +6,5 @@ case class DockerHashStoreEntry
dockerTag: String,
dockerHash: String,
dockerSize: Option[Long],
dockerHashStoreEntryId: Option[Int] = None
dockerHashStoreEntryId: Option[Long] = None
)
Original file line number Diff line number Diff line change
Expand Up @@ -2,11 +2,11 @@ package cromwell.database.sql.tables

case class SubWorkflowStoreEntry
(
rootWorkflowId: Option[Int],
rootWorkflowId: Option[Long],
parentWorkflowExecutionUuid: String,
callFullyQualifiedName: String,
callIndex: Int,
callAttempt: Int,
subWorkflowExecutionUuid: String,
subWorkflowStoreEntryId: Option[Int] = None
subWorkflowStoreEntryId: Option[Long] = None
)
Original file line number Diff line number Diff line change
Expand Up @@ -21,5 +21,5 @@ case class WorkflowStoreEntry
cromwellId: Option[String],
heartbeatTimestamp: Option[Timestamp],
hogGroup: Option[String],
workflowStoreEntryId: Option[Int] = None
workflowStoreEntryId: Option[Long] = None
)