import: Optimize importer overall processing speed #4245
Related unexpected memory usage issue: #4257
@DorianZheng Please remove the
This makes each uploaded SST file larger, which reduces the number of SST files that need to be ingested and improves overall speed. Signed-off-by: kennytm <kennytm@gmail.com>
The bottleneck of Importer is SST ingestion on the TiKV side, not disk I/O. Storing these SSTs in RAM only increases the chance of Importer being killed by OOM. Signed-off-by: Lonng <chris@lonng.org>
Signed-off-by: kennytm <kennytm@gmail.com>
Jobs are now dispatched dynamically through a channel instead of being pre-assigned to threads, to prevent a single slow thread from blocking the entire process. Signed-off-by: Lonng <chris@lonng.org>
Implemented a maximum speed limit for uploads from Importer to TiKV, using the token bucket algorithm. The limit is needed to avoid saturating the network bandwidth: when heartbeats cannot get through, PD assumes that TiKV nodes went down. Signed-off-by: kennytm <kennytm@gmail.com>
Signed-off-by: Lonng <chris@lonng.org>
Signed-off-by: kennytm <kennytm@gmail.com>
Signed-off-by: Lonng <chris@lonng.org>
/test
@huachaohuang PTAL
This PR combines multiple optimizations; we should split it into multiple single-purpose PRs.
@huachaohuang if we split this into 8 PRs, do we also file 8 additional cherrypick-to-2.1 PRs or a single one combining all 8?
@kennytm cherry picks can be combined into one PR.
@huachaohuang I've split this into 5 PRs (see the references above). There are 2 remaining PRs which rely on #4349; I'll file them after #4349 is merged.
All extracted PRs merged.
What have you changed? (mandatory)
Refactor concurrency mode of `ImportJob`
Implement the preemptive concurrency mode.
[Before / After comparison not preserved]
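As an illustration of the preemptive mode, here is a minimal sketch in which idle workers pull the next job from a shared channel instead of receiving a fixed share up front. It uses only the Rust standard library; `run_jobs` and its parameters are hypothetical names for this example and are not taken from the Importer code.

```rust
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical helper: runs `jobs` on `workers` threads. Each worker pulls
/// the next job from a shared channel as soon as it is free, so no job is
/// pre-assigned to a particular thread.
/// Results are returned in completion order, not input order.
fn run_jobs<J, R, F>(jobs: Vec<J>, workers: usize, handler: F) -> Vec<R>
where
    J: Send + 'static,
    R: Send + 'static,
    F: Fn(J) -> R + Send + Sync + 'static,
{
    let (job_tx, job_rx) = mpsc::channel::<J>();
    let (res_tx, res_rx) = mpsc::channel::<R>();
    // std's mpsc receiver is single-consumer, so share it behind a mutex.
    let job_rx = Arc::new(Mutex::new(job_rx));
    let handler = Arc::new(handler);

    // Queue every job, then drop the sender so that workers observe a
    // closed channel once the queue is drained.
    for job in jobs {
        job_tx.send(job).expect("job receiver is alive");
    }
    drop(job_tx);

    let handles: Vec<_> = (0..workers.max(1))
        .map(|_| {
            let job_rx = Arc::clone(&job_rx);
            let res_tx = res_tx.clone();
            let handler = Arc::clone(&handler);
            thread::spawn(move || loop {
                // The lock is held only while dequeuing, not while working.
                let next = job_rx.lock().unwrap().recv();
                match next {
                    Ok(job) => res_tx.send(handler(job)).expect("collector is alive"),
                    Err(_) => break, // channel closed and empty
                }
            })
        })
        .collect();
    drop(res_tx);

    for handle in handles {
        handle.join().expect("worker panicked");
    }
    res_rx.into_iter().collect()
}
```

With this shape, a slow job delays only the worker that picked it up; the remaining workers keep draining the queue.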
Memory optimization
Use a disk-based `Env` instead of a memory-based one.
Redesign retry range strategy
Retry the entire range when any SST import fails.
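The retry strategy can be sketched as follows. This is an assumption-laden illustration, not the PR's implementation: `import_range_with_retry`, the `max_attempts` bound, and the `import_sst` callback are hypothetical, and the PR text does not state how many retries are allowed.

```rust
/// Hypothetical helper: imports every SST of one range through `import_sst`.
/// If any SST fails, the whole range is retried from its first SST.
/// At least one attempt is always made; at most `max_attempts` are made.
/// Returns the number of attempts used, or the error of the last attempt.
fn import_range_with_retry<S, E, F>(
    ssts: &[S],
    max_attempts: usize,
    mut import_sst: F,
) -> Result<usize, E>
where
    F: FnMut(&S) -> Result<(), E>,
{
    let mut attempt = 0;
    loop {
        attempt += 1;
        // `try_for_each` stops at the first SST that fails to import.
        match ssts.iter().try_for_each(|sst| import_sst(sst)) {
            Ok(()) => return Ok(attempt),
            Err(e) if attempt >= max_attempts => return Err(e),
            // Discard partial progress and redo the entire range.
            Err(_) => continue,
        }
    }
}
```

Retrying the whole range is simpler than tracking which individual SSTs succeeded, at the cost of repeating work that had already completed.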
Implement a speed limit for tikv-importer
Implemented a maximum speed limit for uploads from Importer to TiKV, using the token bucket algorithm. The limit is needed to avoid saturating the network bandwidth: when heartbeats cannot get through, PD assumes that TiKV nodes went down.
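For readers unfamiliar with the token bucket algorithm, a minimal sketch follows. It is not the Importer's limiter: the `TokenBucket` type, its fields, and the choice to pass the current time in explicitly (which keeps the example deterministic) are assumptions made for this illustration.

```rust
use std::time::{Duration, Instant};

/// Illustrative token bucket that meters bytes; one token is one byte.
struct TokenBucket {
    capacity: f64,       // maximum burst size, in bytes
    tokens: f64,         // bytes currently available
    refill_per_sec: f64, // sustained rate limit, in bytes per second
    last_refill: Instant,
}

impl TokenBucket {
    /// Creates a bucket that starts full.
    fn new(bytes_per_sec: f64, burst: f64, now: Instant) -> Self {
        TokenBucket {
            capacity: burst,
            tokens: burst,
            refill_per_sec: bytes_per_sec,
            last_refill: now,
        }
    }

    /// Adds the tokens earned since the last refill, capped at `capacity`.
    fn refill(&mut self, now: Instant) {
        if now <= self.last_refill {
            return; // no time has passed
        }
        let elapsed = now.duration_since(self.last_refill).as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        self.last_refill = now;
    }

    /// Tries to take `bytes` tokens at time `now`. Returns `Ok(())` if the
    /// upload may proceed, or `Err(wait)` with the time until enough tokens
    /// will have accumulated. A request larger than `capacity` never succeeds.
    fn try_consume(&mut self, bytes: f64, now: Instant) -> Result<(), Duration> {
        self.refill(now);
        if self.tokens >= bytes {
            self.tokens -= bytes;
            Ok(())
        } else {
            let missing = bytes - self.tokens;
            Err(Duration::from_secs_f64(missing / self.refill_per_sec))
        }
    }
}
```

A sender would call `try_consume` with the size of each chunk before uploading it and sleep for the returned duration on `Err`. The burst capacity bounds how far the upload rate can momentarily exceed the sustained limit.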
What is the type of the changes? (mandatory)
The currently defined types are listed below, please pick one of the types for this PR by removing the others:
How has this PR been tested? (mandatory)
Manually tested.
Does this PR affect documentation (docs) update? (mandatory)
No.
Does this PR affect tidb-ansible update? (mandatory)
No.
Refer to a related PR or issue link (optional)
Benchmark result if necessary (optional)
Add a few positive/negative examples (optional)