Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ddl: fix the ddl txn commit may be conflict with reorg txn #24668

Closed
wants to merge 5 commits into from

Conversation

AilinKid
Copy link
Contributor

@AilinKid AilinKid commented May 14, 2021

Signed-off-by: AilinKid 314806019@qq.com

What problem does this PR solve?

Issue Number: close #24427

Problem Summary:

What is changed and how it works?

What's Changed:

// runReorgJob is used to do `READ` action of reorg info in ddl txn.
// If we want to do `WRITE` action of recording some reorg handle down, we need to additional kv txn.
// Otherwise, the below write conflicts will occur.
//
// ddl txn ------------+-------------------------------+
// (RunInKVTxn: start) |                               | (ddl done successfully)
//                     | (Write)                       | (RunInKVTxn committed fail: write conflict)
//                     V                               V
//               reorg handle
//                                    ^
//                                    | (Update handle, eg: change element)
//                                    | (RunInKVTxn: committed instantly)
// reorg txn--------------------------+
//
// For this case:
// Let's take the ddl txn as a Daemon thread, it isn't response for writing the reorg handle down, but care for reading
// reorg result. Because read won't be conflict with any write action of the reorg txn.

Check List

Tests

  • Unit test
  • Integration test

Release note

  • ddl: fix the ddl txn commit fail cause it's conflict with reorg txn

.
Signed-off-by: AilinKid <314806019@qq.com>
@ti-chi-bot
Copy link
Member

[REVIEW NOTIFICATION]

This pull request has not been approved.

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by writing /lgtm in a comment.
Reviewer can cancel approval by writing /lgtm cancel in a comment.

@ti-chi-bot ti-chi-bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label May 14, 2021
@ti-srebot
Copy link
Contributor

@github-actions github-actions bot added the sig/sql-infra SIG: SQL Infra label May 14, 2021
Signed-off-by: AilinKid <314806019@qq.com>
@AilinKid AilinKid changed the title ddl: fix the ddl txn commit fail cause it's conflict with reorg txn[DNM] ddl: fix the ddl txn commit may be conflict with reorg txn May 14, 2021
@@ -1024,10 +1024,9 @@ func (w *worker) doModifyColumnTypeWithData(
return ver, nil
}
if needRollbackData(err) {
if err1 := t.RemoveDDLReorgHandle(job, reorgInfo.elements); err1 != nil {
if err1 := reorgInfo.CleanReorgMeta(); err1 != nil {
logutil.BgLogger().Warn("[ddl] run modify column job failed, RemoveDDLReorgHandle failed, can't convert job to rollback",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need update this log?

@@ -561,7 +561,7 @@ func (w *worker) onCreateIndex(d *ddlCtx, t *meta.Meta, job *model.Job, isPK boo
if kv.ErrKeyExists.Equal(err) || errCancelledDDLJob.Equal(err) || errCantDecodeRecord.Equal(err) {
logutil.BgLogger().Warn("[ddl] run add index job failed, convert job to rollback", zap.String("job", job.String()), zap.Error(err))
ver, err = convertAddIdxJob2RollbackJob(t, job, tblInfo, indexInfo, err)
if err1 := t.RemoveDDLReorgHandle(job, reorgInfo.elements); err1 != nil {
if err1 := reorgInfo.CleanReorgMeta(); err1 != nil {
logutil.BgLogger().Warn("[ddl] run add index job failed, convert job to rollback, RemoveDDLReorgHandle failed", zap.String("job", job.String()), zap.Error(err1))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.

@@ -707,3 +724,14 @@ func (r *reorgInfo) UpdateReorgMeta(startKey kv.Key) error {
}
return nil
}

func (r *reorgInfo) CleanReorgMeta() error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that this function is similar to RemoveDDLReorgHandle, could we use a similar name?

@@ -220,7 +237,7 @@ func (w *worker) runReorgJob(t *meta.Meta, reorgInfo *reorgInfo, tblInfo *model.
case model.ActionModifyColumn:
metrics.GetBackfillProgressByLabel(metrics.LblModifyColumn).Set(100)
}
if err1 := t.RemoveDDLReorgHandle(job, reorgInfo.elements); err1 != nil {
if err1 := reorgInfo.CleanReorgMeta(); err1 != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This txn and DDL txn are not the same txn. There may be data and schema out of sync. I think this plan can be further optimized

@AilinKid
Copy link
Contributor Author

AilinKid commented May 24, 2021

This pr is regarded as a kind of improvement.
For clear, we can clean the element handle in the finishDDLJob which is used to clean remained indexes. I will do it later.

@ti-chi-bot ti-chi-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 29, 2021
@ti-chi-bot
Copy link
Member

@AilinKid: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@tangenta
Copy link
Contributor

tangenta commented Aug 9, 2021

@AilinKid Please resolve the conflicts.

@AilinKid
Copy link
Contributor Author

AilinKid commented Dec 3, 2021

@AilinKid Please resolve the conflicts.

got

@AilinKid
Copy link
Contributor Author

won't fix it yet, it will be refactored as a whole latter

@AilinKid AilinKid closed this Dec 21, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. sig/sql-infra SIG: SQL Infra size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ddl: reorg handle's update txn is conflict with the ddl job's txn
5 participants