workload: give each kvOp a separate sql.Conn #30811
Conversation
omg 🤦♂️ nice find!
Is it possible for other workloads to be suffering from the same sql.DB internal contention?
Reviewed 1 of 1 files at r1.
Reviewable status: complete! 1 of 0 LGTMs obtained
Very possible! We should get #30810 in and then do an audit.
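For reference, the handlers from #30810 come down to something like the sketch below (the port and the mutex sampling rate are illustrative, not the exact change):

```go
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof/* handlers on the default mux
	"runtime"
)

func main() {
	// Mutex profiling is off by default; sample every contention event so
	// /debug/pprof/mutex has something to show.
	runtime.SetMutexProfileFraction(1)

	// Serve pprof on a side port while the workload runs.
	go func() {
		log.Println(http.ListenAndServe("localhost:6060", nil))
	}()

	// ... run the workload ops here ...
	select {} // placeholder: keep the process alive so the endpoints stay up
}
```

The mutex profile that pointed at `sql.(*Rows).Next` can then be pulled with `go tool pprof http://localhost:6060/debug/pprof/mutex`.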
Hilarious. It's good to hear it's just a client issue!
🙈 that's ridiculous. Personally, I'd add a comment in the code for posterity.
Nice find! TIL about `sql.Stmt.mu`.
Reviewable status: complete! 1 of 0 LGTMs obtained
Force-pushed from d9038ba to 0883209
In cockroachdb#26178, we saw that throughput hit a cliff while running `kv` at high concurrency levels. We spent a while debugging the issue, but nothing stood out in the `cockroach` process. Eventually I installed pprof http handlers in `workload` (cockroachdb#30810). The CPU and heap profiles looked fine but the mutex profile revealed that **99.94%** of mutex contention was in `sql.(*Rows).Next`. It turns out that this method manipulates a lock that's scoped to the same degree as its prepared statement. Since `readStmt` was prepared on the `sql.DB`, all kvOps were contending on the same lock in `sql.(*Rows).Next`. The fix is to give each `kvOp` its own `sql.Conn` and prepare the statement with a connection-level scope. There are probably other areas in `workload` that could use the same kind of change. Before this change, `kv100 --concurrency=400` in the configuration discussed in cockroachdb#26178 topped out at around 80,000 qps. After this change, it tops out at around 250,000 qps. Release note: None
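The shape of the fix, roughly (a simplified sketch; the `kv` schema and query here are stand-ins rather than the exact workload code):

```go
import (
	"context"
	"database/sql"
)

// Each kvOp pins its own connection and prepares its statement on that
// connection, so no two workers share a *sql.Stmt (or its internal mutex).
type kvOp struct {
	conn     *sql.Conn
	readStmt *sql.Stmt
}

func newKVOp(ctx context.Context, db *sql.DB) (*kvOp, error) {
	// Previously readStmt was prepared on the *sql.DB, so every worker shared
	// one statement and contended on its lock inside sql.(*Rows).Next.
	conn, err := db.Conn(ctx)
	if err != nil {
		return nil, err
	}
	readStmt, err := conn.PrepareContext(ctx, `SELECT v FROM kv WHERE k = $1`)
	if err != nil {
		conn.Close()
		return nil, err
	}
	return &kvOp{conn: conn, readStmt: readStmt}, nil
}
```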
Force-pushed from 0883209 to b9ffacb
Done. TFTRs! bors r+
30811: workload: give each kvOp a separate sql.Conn r=nvanbenschoten a=nvanbenschoten In #26178, we saw that throughput hit a cliff while running `kv` at high concurrency levels. We spent a while debugging the issue, but nothing stood out in the `cockroach` process. Eventually, I installed pprof http handlers in `workload` (#30810). The CPU and heap profiles looked fine but the mutex profile revealed that **99.94%** of mutex contention was in `sql.(*Rows).Next`. It turns out that this method manipulates a lock that's scoped to the same degree as its prepared statement. Since `readStmt` was prepared on the `sql.DB`, all kvOps were contending on the same lock in `sql.(*Rows).Next`. The fix is to give each `kvOp` its own `sql.Conn` and prepare the statement with a connection-level scope. There are probably other places in `workload` that could use the same kind of change. Before this change, `kv100 --concurrency=400` in the configuration discussed in #26178 topped out at around 80,000 qps. After this change, it tops out at around 250,000 qps. Release note: None Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Build succeeded
pkg/workload/kv/kv.go, line 183 at r2 (raw file):
@nvanbenschoten did you want
🤦♂️ yes. Thanks for catching that.
30482: workload/kv: print the highest sequence number r=andreimatei a=andreimatei At the end of the kv workload, print the sequence number of the highest row written so that it can be used in a subsequent run: for example, you might want to prepopulate the db with --read_percent=0 and then perform only reads with --read_percent=100. If you don't pass --write-seq (or you don't know what to put in it), all the reads from --read_percent=100 would try to read a non-existent row with sentinel sequence number 0. Release note: None 30882: workload: actually prepare writeStmt with Conn r=nvanbenschoten a=nvanbenschoten See #30811 (comment). Release note: None 30944: workload: teach interleavedpartitioned about retryable transactions r=BramGruneir a=BramGruneir Also some other speedups and cleanups while I was in there. Fixes #28567. Release note: None Co-authored-by: Andrei Matei <andrei@cockroachlabs.com> Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com> Co-authored-by: Bram Gruneir <bram@cockroachlabs.com>
Has anyone tried https://github.com/jackc/pgx in workload? I'm really curious if it is faster.
I tried using it as a driver for
Yes, using their API directly is what I should have specified in my suggestion.
I played around with it a while ago, using their API directly, and also found it to be slower. I was surprised but didn't put much time into trying to figure out what was causing the slowdown.
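For the record, driving pgx through its native API (rather than through `database/sql`) looks roughly like the sketch below; this is written against pgx v5, and the table/query are placeholders rather than workload code:

```go
import (
	"context"
	"fmt"

	"github.com/jackc/pgx/v5"
)

func readOne(ctx context.Context, url string, key int64) error {
	// pgx.Connect returns a single, non-pooled connection; the native API
	// bypasses database/sql's shared statement/rows locking.
	conn, err := pgx.Connect(ctx, url)
	if err != nil {
		return err
	}
	defer conn.Close(ctx)

	var v []byte
	if err := conn.QueryRow(ctx, `SELECT v FROM kv WHERE k = $1`, key).Scan(&v); err != nil {
		return err
	}
	fmt.Printf("k=%d len(v)=%d\n", key, len(v))
	return nil
}
```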
This commit gives ycsb the same treatment we gave kv in cockroachdb#30811 (4 years ago!). It removes a throughput bottleneck in the workload client by giving each ycsb worker goroutine its own `sql.Conn`, instead of having them all pull from a sufficiently sized `sql.DB` pool. Doing so avoids mutex contention. This commit also goes a step further than cockroachdb#30811 by creating multiple `sql.DB` objects and pulling only a bounded number of connections from each. This further avoids mutex contention and removes the next bottleneck. Without these changes, a ycsb driver on a 64 vCPU machine could push about 200k qps before hitting a ceiling. With them, I've seen the driver push up to about 500k qps before CRDB itself became the bottleneck. Release justification: workload only.
86841: workload/ycsb: give each worker a separate sql.Conn r=nvanbenschoten a=nvanbenschoten This commit gives ycsb the same treatment we gave kv in #30811 (4 years ago!). It removes a throughput bottleneck in the workload client by giving each ycsb worker goroutine its own `sql.Conn`, instead of having them all pull from a sufficiently sized `sql.DB` pool. Doing so avoids mutex contention. This commit also goes a step further than #30811 by creating multiple `sql.DB` objects and pulling only a bounded number of connections from each. This further avoids mutex contention and removes the next bottleneck. Without these changes, a ycsb driver on a 64 vCPU machine could push about 200k qps before hitting a ceiling. With them, I've seen the driver push up to about 500k qps before CRDB itself became the bottleneck. Release justification: workload only. Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
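The multiple-`sql.DB` idea comes down to something like this (a sketch; the driver name, pool count, and per-pool connection cap are illustrative, not the ycsb commit's exact code):

```go
import (
	"context"
	"database/sql"

	_ "github.com/lib/pq" // assumed Postgres-wire driver for this sketch
)

// openPools opens several independent *sql.DB handles to the same cluster and
// caps each one, so a single pool's internal mutex never serializes all workers.
func openPools(url string, numPools, connsPerPool int) ([]*sql.DB, error) {
	pools := make([]*sql.DB, numPools)
	for i := range pools {
		db, err := sql.Open("postgres", url)
		if err != nil {
			return nil, err
		}
		db.SetMaxOpenConns(connsPerPool)
		db.SetMaxIdleConns(connsPerPool)
		pools[i] = db
	}
	return pools, nil
}

// Each worker still pins its own *sql.Conn (as in #30811), just drawn from
// one of the smaller pools in round-robin fashion.
func workerConn(ctx context.Context, pools []*sql.DB, workerIdx int) (*sql.Conn, error) {
	return pools[workerIdx%len(pools)].Conn(ctx)
}
```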
In #26178, we saw that throughput hit a cliff while running `kv` at high concurrency levels. We spent a while debugging the issue, but nothing stood out in the `cockroach` process. Eventually, I installed pprof http handlers in `workload` (#30810). The CPU and heap profiles looked fine but the mutex profile revealed that **99.94%** of mutex contention was in `sql.(*Rows).Next`.

It turns out that this method manipulates a lock that's scoped to the same degree as its prepared statement. Since `readStmt` was prepared on the `sql.DB`, all kvOps were contending on the same lock in `sql.(*Rows).Next`.

The fix is to give each `kvOp` its own `sql.Conn` and prepare the statement with a connection-level scope. There are probably other places in `workload` that could use the same kind of change.

Before this change, `kv100 --concurrency=400` in the configuration discussed in #26178 topped out at around 80,000 qps. After this change, it tops out at around 250,000 qps.

Release note: None