workload: give each kvOp a separate sql.Conn #30811
Conversation
omg 🤦♂️ nice find!
Is it possible for other workloads to be suffering from the same sql.DB internal contention?
Reviewed 1 of 1 files at r1.
Reviewable status: complete! 1 of 0 LGTMs obtained
Very possible! We should get #30810 in and then do an audit.
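For reference, the handlers from #30810 come down to something like the sketch below (the port and the mutex sampling rate are illustrative, not the exact change):

```go
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof/* handlers on the default mux
	"runtime"
)

func main() {
	// Mutex profiling is off by default; sample every contention event so
	// /debug/pprof/mutex has something to show.
	runtime.SetMutexProfileFraction(1)

	// Serve pprof on a side port while the workload runs.
	go func() {
		log.Println(http.ListenAndServe("localhost:6060", nil))
	}()

	// ... run the workload ops here ...
	select {} // placeholder: keep the process alive so the endpoints stay up
}
```

The mutex profile that pointed at `sql.(*Rows).Next` can then be pulled with `go tool pprof http://localhost:6060/debug/pprof/mutex`.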
Hilarious. It's good to hear it's just a client issue!
🙈 that's ridiculous. Personally, I'd add a comment in the code for posterity.
Nice find! TIL about `sql.Stmt.mu`.
Reviewable status: complete! 1 of 0 LGTMs obtained
Force-pushed from d9038ba to 0883209
In cockroachdb#26178, we saw that throughput hit a cliff while running `kv` at high concurrency levels. We spent a while debugging the issue, but nothing stood out in the `cockroach` process. Eventually I installed pprof http handlers in `workload` (cockroachdb#30810). The CPU and heap profiles looked fine but the mutex profile revealed that **99.94%** of mutex contention was in `sql.(*Rows).Next`. It turns out that this method manipulates a lock that's scoped to the same degree as its prepared statement. Since `readStmt` was prepared on the `sql.DB`, all kvOps were contending on the same lock in `sql.(*Rows).Next`. The fix is to give each `kvOp` its own `sql.Conn` and prepare the statement with a connection-level scope. There are probably other areas in `workload` that could use the same kind of change. Before this change, `kv100 --concurrency=400` in the configuration discussed in cockroachdb#26178 topped out at around 80,000 qps. After this change, it tops out at around 250,000 qps. Release note: None
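The shape of the fix, roughly (a simplified sketch; the `kv` schema and query here are stand-ins rather than the exact workload code):

```go
import (
	"context"
	"database/sql"
)

// Each kvOp pins its own connection and prepares its statement on that
// connection, so no two workers share a *sql.Stmt (or its internal mutex).
type kvOp struct {
	conn     *sql.Conn
	readStmt *sql.Stmt
}

func newKVOp(ctx context.Context, db *sql.DB) (*kvOp, error) {
	// Previously readStmt was prepared on the *sql.DB, so every worker shared
	// one statement and contended on its lock inside sql.(*Rows).Next.
	conn, err := db.Conn(ctx)
	if err != nil {
		return nil, err
	}
	readStmt, err := conn.PrepareContext(ctx, `SELECT v FROM kv WHERE k = $1`)
	if err != nil {
		conn.Close()
		return nil, err
	}
	return &kvOp{conn: conn, readStmt: readStmt}, nil
}
```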
Force-pushed from 0883209 to b9ffacb
Done. TFTRs! bors r+
30811: workload: give each kvOp a separate sql.Conn r=nvanbenschoten a=nvanbenschoten In #26178, we saw that throughput hit a cliff while running `kv` at high concurrency levels. We spent a while debugging the issue, but nothing stood out in the `cockroach` process. Eventually, I installed pprof http handlers in `workload` (#30810). The CPU and heap profiles looked fine but the mutex profile revealed that **99.94%** of mutex contention was in `sql.(*Rows).Next`. It turns out that this method manipulates a lock that's scoped to the same degree as its prepared statement. Since `readStmt` was prepared on the `sql.DB`, all kvOps were contending on the same lock in `sql.(*Rows).Next`. The fix is to give each `kvOp` its own `sql.Conn` and prepare the statement with a connection-level scope. There are probably other places in `workload` that could use the same kind of change. Before this change, `kv100 --concurrency=400` in the configuration discussed in #26178 topped out at around 80,000 qps. After this change, it tops out at around 250,000 qps. Release note: None Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Build succeeded
pkg/workload/kv/kv.go, line 183 at r2 (raw file):
@nvanbenschoten did you want
🤦♂️ yes. Thanks for catching that.
30482: workload/kv: print the highest sequence number r=andreimatei a=andreimatei At the end of the kv workload, print the sequence number of the highest row written so that it can be used in a subsequent run: for example, you might want to prepopulate the db with --read_percent=0 and then perform only reads with --read_percent=100. If you don't pass --write-seq (or you don't know what to put in it), all the reads from --read_percent=100 would try to read a non-existent row with sentinel sequence number 0. Release note: None 30882: workload: actually prepare writeStmt with Conn r=nvanbenschoten a=nvanbenschoten See #30811 (comment). Release note: None 30944: workload: teach interleavedpartitioned about retryable transactions r=BramGruneir a=BramGruneir Also some other speedups and cleanups while I was in there. Fixes #28567. Release note: None Co-authored-by: Andrei Matei <andrei@cockroachlabs.com> Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com> Co-authored-by: Bram Gruneir <bram@cockroachlabs.com>
Has anyone tried https://github.com/jackc/pgx in workload? I'm really curious if it is faster.
I tried using it as a driver for
Yes, using their API directly is what I should have specified in my suggestion.
I played around with it a while ago, using their API directly, and also found it to be slower. I was surprised but didn't put much time into trying to figure out what was causing the slowdown.
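For the record, driving pgx through its native API (rather than through `database/sql`) looks roughly like the sketch below; this is written against pgx v5, and the table/query are placeholders rather than workload code:

```go
import (
	"context"
	"fmt"

	"github.com/jackc/pgx/v5"
)

func readOne(ctx context.Context, url string, key int64) error {
	// pgx.Connect returns a single, non-pooled connection; the native API
	// bypasses database/sql's shared statement/rows locking.
	conn, err := pgx.Connect(ctx, url)
	if err != nil {
		return err
	}
	defer conn.Close(ctx)

	var v []byte
	if err := conn.QueryRow(ctx, `SELECT v FROM kv WHERE k = $1`, key).Scan(&v); err != nil {
		return err
	}
	fmt.Printf("k=%d len(v)=%d\n", key, len(v))
	return nil
}
```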
This commit gives ycsb the same treatment we gave kv in cockroachdb#30811 (4 years ago!). It removes a throughput bottleneck in the workload client by giving each ycsb worker goroutine its own `sql.Conn`, instead of having them all pull from a sufficiently sized `sql.DB` pool. Doing so avoids mutex contention. This commit also goes a step further than cockroachdb#30811 by creating multiple `sql.DB` objects and pulling only a bounded number of connections from each. This further avoids mutex contention and removes the next bottleneck. Without these changes, a ycsb driver on a 64 vCPU machine could push about 200k qps before hitting a ceiling. With them, I've seen the driver push up to about 500k qps before CRDB itself became the bottleneck. Release justification: workload only.
86841: workload/ycsb: give each worker a separate sql.Conn r=nvanbenschoten a=nvanbenschoten This commit gives ycsb the same treatment we gave kv in #30811 (4 years ago!). It removes a throughput bottleneck in the workload client by giving each ycsb worker goroutine its own `sql.Conn`, instead of having them all pull from a sufficiently sized `sql.DB` pool. Doing so avoids mutex contention. This commit also goes a step further than #30811 by creating multiple `sql.DB` objects and pulling only a bounded number of connections from each. This further avoids mutex contention and removes the next bottleneck. Without these changes, a ycsb driver on a 64 vCPU machine could push about 200k qps before hitting a ceiling. With them, I've seen the driver push up to about 500k qps before CRDB itself became the bottleneck. Release justification: workload only. Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
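The multiple-`sql.DB` idea comes down to something like this (a sketch; the driver name, pool count, and per-pool connection cap are illustrative, not the ycsb commit's exact code):

```go
import (
	"context"
	"database/sql"

	_ "github.com/lib/pq" // assumed Postgres-wire driver for this sketch
)

// openPools opens several independent *sql.DB handles to the same cluster and
// caps each one, so a single pool's internal mutex never serializes all workers.
func openPools(url string, numPools, connsPerPool int) ([]*sql.DB, error) {
	pools := make([]*sql.DB, numPools)
	for i := range pools {
		db, err := sql.Open("postgres", url)
		if err != nil {
			return nil, err
		}
		db.SetMaxOpenConns(connsPerPool)
		db.SetMaxIdleConns(connsPerPool)
		pools[i] = db
	}
	return pools, nil
}

// Each worker still pins its own *sql.Conn (as in #30811), just drawn from
// one of the smaller pools in round-robin fashion.
func workerConn(ctx context.Context, pools []*sql.DB, workerIdx int) (*sql.Conn, error) {
	return pools[workerIdx%len(pools)].Conn(ctx)
}
```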
In #26178, we saw that throughput hit a cliff while running `kv` at high concurrency levels. We spent a while debugging the issue, but nothing stood out in the `cockroach` process. Eventually, I installed pprof http handlers in `workload` (#30810). The CPU and heap profiles looked fine but the mutex profile revealed that **99.94%** of mutex contention was in `sql.(*Rows).Next`.

It turns out that this method manipulates a lock that's scoped to the same degree as its prepared statement. Since `readStmt` was prepared on the `sql.DB`, all kvOps were contending on the same lock in `sql.(*Rows).Next`.

The fix is to give each `kvOp` its own `sql.Conn` and prepare the statement with a connection-level scope. There are probably other places in `workload` that could use the same kind of change.

Before this change, `kv100 --concurrency=400` in the configuration discussed in #26178 topped out at around 80,000 qps. After this change, it tops out at around 250,000 qps.

Release note: None