importer: Check whether file exist before importing (#9522) #9544

Merged: 6 commits into tikv:release-4.0 on Jan 26, 2021


@ti-srebot ti-srebot commented Jan 22, 2021

cherry-pick #9522 to release-4.0
You can switch your code base to this Pull Request by using git-extras:

# In tikv repo:
git pr https://github.com/tikv/tikv/pull/9544

After applying your modifications, you can push your changes to this PR via:

git push git@github.com:ti-srebot/tikv.git pr/9544:release-4.0-f4be3a0bf2fa

What problem does this PR solve?

Issue Number: close #9496

Problem Summary:
If TiKV has applied an ingest command but tidb-lightning (the import client) believes the request failed, the client sends the same ingest request to TiKV again. This can cause TiKV to panic, because the file has already been ingested and can no longer be found.
There are many cases in which a request is reported as failed even though it has been applied, such as:

  1. The network drops some packets of the ingest response.
  2. The command was replicated to more than half of the members of the Raft group, but the original leader crashed before the entry for this command was committed. The new leader then committed the entry, but the client had already received a failure response from the original leader.

What is changed and how it works?

We can propose an ingest request only when all of the following conditions are met:

  1. There is no pending ingest request for this SST.
  2. The current leader has applied to the current term, and we can confirm that the file we want to ingest exists.
  3. The ingested file must be removed before we return the response to the client, so that a new request never sees the same file again.
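
The conditions above can be sketched as a small gate in front of the propose path. This is a simplified illustration, not the code in this PR: `IngestGate`, `LeaderState`, `Reject`, and the method names are hypothetical, and the importer's on-disk SST directory is modeled as an in-memory set.

```rust
use std::collections::HashSet;

/// Why a proposal was rejected (hypothetical error type for this sketch).
#[derive(Debug, PartialEq)]
pub enum Reject {
    PendingIngest,
    LeaderNotCaughtUp,
    FileNotFound,
}

/// Minimal view of the leader's Raft progress (hypothetical).
pub struct LeaderState {
    pub current_term: u64,
    pub applied_term: u64,
}

/// Tracks uploaded SST files and in-flight ingest requests, keyed by SST name.
#[derive(Default)]
pub struct IngestGate {
    files: HashSet<String>,   // stands in for the files on disk
    pending: HashSet<String>, // ingest requests proposed but not yet finished
}

impl IngestGate {
    /// Record that an SST file has been uploaded and is ready to ingest.
    pub fn upload(&mut self, sst: &str) {
        self.files.insert(sst.to_string());
    }

    /// Check the propose-time conditions; mark the SST as pending on success.
    pub fn try_propose(&mut self, sst: &str, leader: &LeaderState) -> Result<(), Reject> {
        // Condition 1: no pending ingest request for this SST.
        if self.pending.contains(sst) {
            return Err(Reject::PendingIngest);
        }
        // Condition 2a: the leader has applied to its current term.
        if leader.applied_term != leader.current_term {
            return Err(Reject::LeaderNotCaughtUp);
        }
        // Condition 2b: the file to ingest still exists.
        if !self.files.contains(sst) {
            return Err(Reject::FileNotFound);
        }
        self.pending.insert(sst.to_string());
        Ok(())
    }

    /// Condition 3: remove the file before the response goes back to the client.
    pub fn finish(&mut self, sst: &str) {
        self.files.remove(sst);
        self.pending.remove(sst);
    }
}
```

In this sketch, a retried request for an SST that has already been ingested is rejected with `Reject::FileNotFound` at propose time, so it never reaches the apply path where the missing file would trigger the panic.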

Related changes

  • PR to update pingcap/docs / pingcap/docs-cn:
  • PR to update pingcap/tidb-ansible:
  • Need to cherry-pick to the release branch

Check List

Tests

  • Unit test

Side effects

  • Performance regression
    • It may increase the duration of ingest requests.
  • Breaking backward compatibility

Release note

  • No release note.

Signed-off-by: ti-srebot <ti-srebot@pingcap.com>
@ti-srebot
Contributor Author

/run-all-tests

@ti-srebot ti-srebot added the component/backup-restore (Component: backup, import, external_storage), sig/raft (Component: Raft, RaftStore, etc.), size/L (Denotes a PR that changes 100-499 lines, ignoring generated files.), and type/cherry-pick (Type: PR - Cherry pick) labels Jan 22, 2021
@ti-srebot ti-srebot added this to the v4.0.10 milestone Jan 22, 2021
@ti-srebot
Contributor Author

@Little-Wallace you're already a collaborator in bot's repo.

@ti-chi-bot ti-chi-bot added the size/XL (Denotes a PR that changes 500-999 lines, ignoring generated files.) label and removed the size/L (Denotes a PR that changes 100-499 lines, ignoring generated files.) label Jan 22, 2021
@@ -361,8 +376,14 @@ impl<W: WriteBatch + WriteBatchVecExt<RocksEngine>> ApplyContext<W> {
exec_ctx: None,
use_delete_range: cfg.use_delete_range,
yield_duration: cfg.apply_yield_duration.0,
<<<<<<< HEAD
Contributor

Merge conflict.

Contributor

Done

Signed-off-by: Little-Wallace <bupt2013211450@gmail.com>
@ti-chi-bot ti-chi-bot added the size/XXL (Denotes a PR that changes 1000+ lines, ignoring generated files.) label and removed the size/XL (Denotes a PR that changes 500-999 lines, ignoring generated files.) label Jan 24, 2021
@jebter jebter modified the milestones: v4.0.10, v4.0.11 Jan 25, 2021
Signed-off-by: Little-Wallace <bupt2013211450@gmail.com>
@gengliqi
Member

/lgtm

@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Jan 25, 2021
@Little-Wallace
Contributor

/merge

@ti-chi-bot
Member

@Little-Wallace: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

You only need to trigger /merge once, and if the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

If you have any questions about the PR merge process, please refer to pr process.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot
Member

@Little-Wallace: /merge is only allowed for the committers in list.

In response to this:

/merge

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

Member

@NingLin-P NingLin-P left a comment

/LGTM

@ti-chi-bot
Member

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • NingLin-P
  • gengliqi

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by writing /lgtm in a comment.
Reviewer can cancel approval by writing /lgtm cancel in a comment.

@ti-chi-bot ti-chi-bot added the status/LGT2 (Indicates that a PR has LGTM 2.) label and removed the status/LGT1 (Indicates that a PR has LGTM 1.) label Jan 25, 2021
@NingLin-P
Member

/merge

@ti-chi-bot
Member

@NingLin-P: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

You only need to trigger /merge once, and if the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

If you have any questions about the PR merge process, please refer to pr process.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot
Member

This pull request has been accepted and is ready to merge.

Commit hash: e848c43

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Jan 26, 2021
@Little-Wallace
Contributor

/test

3 similar comments
@Little-Wallace
Contributor

/test

@Little-Wallace
Contributor

/test

@Little-Wallace
Contributor

/test

@ti-chi-bot ti-chi-bot merged commit 9a98501 into tikv:release-4.0 Jan 26, 2021
gengliqi pushed a commit to gengliqi/tikv that referenced this pull request Feb 20, 2021
Labels
  • component/backup-restore (Component: backup, import, external_storage)
  • sig/migrate
  • sig/raft (Component: Raft, RaftStore, etc.)
  • size/XXL (Denotes a PR that changes 1000+ lines, ignoring generated files.)
  • status/can-merge (Indicates a PR has been approved by a committer.)
  • status/LGT2 (Indicates that a PR has LGTM 2.)
  • type/cherry-pick (Type: PR - Cherry pick)
7 participants