
workload/ycsb: give each worker a separate sql.Conn #86841

Merged · 1 commit · Aug 29, 2022

Conversation

nvanbenschoten
Member

This commit gives ycsb the same treatment we gave kv in #30811 (4 years
ago!). It removes a throughput bottleneck in the workload client by
giving each ycsb worker goroutine its own `sql.Conn`, instead of having
them all pull from a sufficiently sized `sql.DB` pool. Doing so avoids
mutex contention.
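
Roughly, the per-worker pattern looks like the sketch below (illustrative only, not the actual workload code; `runWorker` and `stmt` are made-up names):

```go
package sketch

import (
	"context"
	"database/sql"
)

// runWorker is a hypothetical helper showing the shape of the change:
// db.Conn pins a single connection to this goroutine, so each operation
// skips the sql.DB pool's shared free-list and its mutex.
func runWorker(ctx context.Context, db *sql.DB, stmt string) error {
	conn, err := db.Conn(ctx)
	if err != nil {
		return err
	}
	defer conn.Close()

	for {
		if _, err := conn.ExecContext(ctx, stmt); err != nil {
			return err // includes context cancellation when the run ends
		}
	}
}
```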

This commit also goes a step further than #30811 by creating multiple
`sql.DB` objects and pulling only a bounded number of connections from
each. This further avoids mutex contention and removes the next
bottleneck.
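
Continuing the sketch above, the multiple-pool part might look like this (the `openPools` helper, driver name, and sizing are illustrative assumptions, not the PR's actual code):

```go
// openPools opens several sql.DB handles against the same cluster and caps
// how many connections each may hand out, so no single pool's internal
// mutex becomes a hot spot. Worker i would then grab its dedicated
// sql.Conn from pools[i%len(pools)].
func openPools(numWorkers, connsPerPool int, url string) ([]*sql.DB, error) {
	numPools := (numWorkers + connsPerPool - 1) / connsPerPool
	pools := make([]*sql.DB, 0, numPools)
	for i := 0; i < numPools; i++ {
		db, err := sql.Open("postgres", url) // assumes a registered pq/pgx driver
		if err != nil {
			return nil, err
		}
		db.SetMaxOpenConns(connsPerPool)
		db.SetMaxIdleConns(connsPerPool)
		pools = append(pools, db)
	}
	return pools, nil
}
```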

Without these changes, a ycsb driver on a 64 vCPU machine could push
about 200k qps before hitting a ceiling. With them, I've seen the driver
push up to about 500k qps before CRDB itself became the bottleneck.

Release justification: workload only.

@nvanbenschoten nvanbenschoten requested review from aayushshah15 and a team August 25, 2022 02:49
@cockroach-teamcity
Member

This change is Reviewable

@nvanbenschoten nvanbenschoten force-pushed the nvanbenschoten/ycsbSqlConn branch from 18de6b3 to 566bb16 on August 26, 2022 19:35
@erikgrinaker erikgrinaker (Contributor) left a comment

Thanks!

Do you know if this could explain why YCSB/E has remained essentially flat, even when other benchmarks have shown dramatic regressions? We hypothesized that it was due to network bandwidth saturation, but it seems like this might be a contributing factor as well.

@aayushshah15 aayushshah15 (Contributor) left a comment

Do you think it's worth re-running any benchmarks from #85993 with this patch?

@nvanbenschoten
Member Author

TFTRs!

bors r+

> Do you know if this could explain why YCSB/E has remained essentially flat, even when other benchmarks have shown dramatic regressions?

We only started to see a client-side bottleneck due to mutex contention at about 250k qps, so I don't think this would explain the YCSB-E results unfortunately.

> Do you think it's worth re-running any benchmarks from #85993 with this patch?

It's possible that this had an impact on YCSB-C and YCSB-D, because those are starting to breach the 100k qps mark. Still, I don't think it's worth spending the time to re-run given that we could push up to 250k qps before being entirely bottlenecked on the client.

@craig
Contributor

craig bot commented Aug 29, 2022

Build succeeded:
