Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

raftstore: remove the local reader thread #4558

Merged
merged 52 commits into from
May 23, 2019

Conversation

5kbpers
Copy link
Member

@5kbpers 5kbpers commented Apr 23, 2019

What have you changed? (mandatory)

In TiKV, all read requests are collected into batches and executed by a thread called local-reader.
We found that this thread has become a bottleneck of read performance.
This PR removes this thread and moves the execution of read request to the readpool, which improves the read performance and reduces context switches.

What are the type of the changes? (mandatory)

  • Improvement (change which is an improvement to an existing feature)

How has this PR been tested? (mandatory)

Unit test, integration test, Jepsen test(still running, don't find any problem by now)

Does this PR affect tidb-ansible update? (mandatory)

Yes. see pingcap/tidb-ansible#753

Signed-off-by: 5kbpers <tangminghua@pingcap.com>
Signed-off-by: 5kbpers <tangminghua@pingcap.com>
Signed-off-by: 5kbpers <tangminghua@pingcap.com>
Signed-off-by: 5kbpers <tangminghua@pingcap.com>
Signed-off-by: 5kbpers <tangminghua@pingcap.com>
Signed-off-by: 5kbpers <tangminghua@pingcap.com>
Signed-off-by: 5kbpers <tangminghua@pingcap.com>
Signed-off-by: 5kbpers <tangminghua@pingcap.com>
Signed-off-by: 5kbpers <tangminghua@pingcap.com>
Signed-off-by: 5kbpers <tangminghua@pingcap.com>
Signed-off-by: 5kbpers <tangminghua@pingcap.com>
Signed-off-by: 5kbpers <tangminghua@pingcap.com>
@5kbpers
Copy link
Member Author

5kbpers commented Apr 24, 2019

/run-integration-tests

@zhangjinpeng87
Copy link
Member

zhangjinpeng87 commented Apr 24, 2019

After this pr merged, don't forget to modify tikv's grafana in tidb-ansible repo.

@ngaut
Copy link
Member

ngaut commented Apr 24, 2019

Do we have any benchmark results?

@5kbpers
Copy link
Member Author

5kbpers commented Apr 24, 2019

After this pr merged, don't forget to modify tikv's grafana in tidb-ansible repo.

@zhangjinpeng1987 OK

@5kbpers
Copy link
Member Author

5kbpers commented Apr 24, 2019

Do we have any benchmark results?

@ngaut I wrote a test log on confluence: 2019-04 Remove local reader thread

@ngaut
Copy link
Member

ngaut commented Apr 24, 2019

Could you post a brief result here?

@5kbpers
Copy link
Member Author

5kbpers commented Apr 24, 2019

YCSB Stress Testing

TiKV Configuration

readpool.storage.high-concurrency: 24
readpool.storage.normal-concurrency: 24
readpool.storage.low-concurrency: 24
readpool.coprocessor.high-concurrency: 24
readpool.coprocessor.normal-concurrency: 24
readpool.coprocessor.low-concurrency: 24
server.grpc-concurrency: 12
server.stats-concurrency: 5

Deploy single TiKV instance and single PD instance.

YCSB Configuration

use go-ycsb.
workload:

recordcount=10000000
operationcount=100000000
workload=core
readallfields=true
readproportion=1
updateproportion=0
scanproportion=0
insertproportion=0
requestdistribution=uniform

Two YCSB clients, script:

./bin/go-ycsb run tikv -P workload-point-get -p tikv.type="txn" -p threadcount=2048

Result

Both throughput and latency get pretty improvement, and the frequency of context switch has decrease.
image
image

@5kbpers
Copy link
Member Author

5kbpers commented Apr 24, 2019

Could you post a brief result here?

@ngaut OK, I have just posted it, see the previous comment.

@ngaut
Copy link
Member

ngaut commented Apr 24, 2019

Thanks.

Signed-off-by: 5kbpers <tangminghua@pingcap.com>
@5kbpers
Copy link
Member Author

5kbpers commented Apr 29, 2019

/run-all-tests

1 similar comment
@5kbpers
Copy link
Member Author

5kbpers commented May 5, 2019

/run-all-tests

@5kbpers 5kbpers changed the title [DNM] raftstore: remove local reader thread raftstore: remove local reader thread May 6, 2019
@hicqu hicqu mentioned this pull request May 21, 2019
5 tasks
src/raftstore/store/worker/read.rs Show resolved Hide resolved
src/raftstore/store/worker/read.rs Outdated Show resolved Hide resolved
src/raftstore/store/worker/read.rs Show resolved Hide resolved
src/raftstore/store/worker/read.rs Outdated Show resolved Hide resolved
@@ -370,14 +370,6 @@ impl Peer {
pub fn activate<T, C>(&self, ctx: &PollContext<T, C>) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please update the comment, and I prefer writting

let mut meta = ctx.store_meta.lock().unwrap();
            meta.readers
                .insert(self.region_id, ReadDelegate::from_peer(self));

here rather than in post_raft_ready_append

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess seperating them is better for now, because activate is also called in fsm/peer.rs, where the meta is already locked. It will introduce much more changes.

src/raftstore/store/peer.rs Outdated Show resolved Hide resolved
@hicqu
Copy link
Contributor

hicqu commented May 22, 2019

PTAL @Connor1996 thanks

Connor1996
Connor1996 previously approved these changes May 22, 2019
Copy link
Member

@Connor1996 Connor1996 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hicqu
Copy link
Contributor

hicqu commented May 22, 2019

/run-all-tests

Connor1996
Connor1996 previously approved these changes May 22, 2019
Copy link
Member

@Connor1996 Connor1996 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@overvenus overvenus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rest LGTM

src/raftstore/store/peer.rs Outdated Show resolved Hide resolved
return;
} else {
// Remove delegate for updating it by next cmd execution.
self.delegates.borrow_mut().remove(&region_id);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not move this line to L339?

Some(delegate) => {
fail_point!("localreader_on_find_delegate");
delegate
match delegate.take() {
Copy link
Member

@zhangjinpeng87 zhangjinpeng87 May 23, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why take away? How about the following read requests for this region?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's because of lifetime. After handle a request, it will be put into the hashmap back.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may be a performance issue, please pay attention to it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.

Copy link
Member

@Connor1996 Connor1996 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hicqu
Copy link
Contributor

hicqu commented May 23, 2019

/run-all-tests

@hicqu hicqu merged commit 0a9ebfc into tikv:master May 23, 2019
jswh pushed a commit to jswh/tikv that referenced this pull request May 27, 2019
Signed-off-by: 5kbpers <tangminghua@pingcap.com>
sticnarf pushed a commit to sticnarf/tikv that referenced this pull request Oct 27, 2019
Signed-off-by: 5kbpers <tangminghua@pingcap.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sig/raft Component: Raft, RaftStore, etc.
Projects
None yet
Development

Successfully merging this pull request may close these issues.