refactor(bigquery/storage/managedwriter): more advanced routing #7587
Conversation
This PR allows the flow controller to report bytes in flight for flow controllers with a bounded byte limit. The primary load signals for a connection are the inserts and bytes in flight as reported by the flow controller, and this change makes bytes in flight a usable signal. Important note: an unbounded flow controller will not report any bytes in flight. This avoids odd situations, due to size normalization, where the bytes tracked and the actual capacity of the semaphore could drift out of sync. Towards: googleapis#7103
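As a rough sketch of that reporting behavior (the field and method names here are hypothetical; the real flowController in managedwriter differs in its details), a controller configured without a byte bound simply reports zero bytes in flight rather than a value that could drift from the semaphore's real capacity:

// Hypothetical sketch only, not the actual managedwriter flow controller.
package routingexample

import "sync/atomic"

type flowController struct {
	maxBytes      int64 // <= 0 means the controller is unbounded on bytes
	bytesInFlight int64 // only tracked when a byte bound exists
}

// track records sz bytes entering flight, but only for bounded controllers,
// so the tracked total can't diverge from the byte semaphore's capacity.
func (fc *flowController) track(sz int64) {
	if fc.maxBytes > 0 {
		atomic.AddInt64(&fc.bytesInFlight, sz)
	}
}

// release records sz bytes leaving flight for bounded controllers.
func (fc *flowController) release(sz int64) {
	if fc.maxBytes > 0 {
		atomic.AddInt64(&fc.bytesInFlight, -sz)
	}
}

// bytes reports bytes in flight as a load signal; unbounded controllers report zero.
func (fc *flowController) bytes() int64 {
	if fc.maxBytes <= 0 {
		return 0
	}
	return atomic.LoadInt64(&fc.bytesInFlight)
}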
I will continue the review tomorrow
// only look for rebalance opportunities between different connections.
for mostIdleIdx != leastIdleIdx {
	// targetConn is the least idle (most heavily loaded) connection in the snapshot.
	targetConn := sr.multiConns[leastIdleIdx]
	// compare its load against the most idle connection, scaled by the rebalance threshold.
	if targetConn.curLoad() < mostIdleLoad*connLoadDeltaThreshold {
Isn't targetConn's curLoad liable to change while we're rebalancing, since messages are still being processed? If so, we may not actually be checking the busiest connection in the ordered list.
Yes, the entire system is dynamic. We're essentially taking a snapshot of the connections on an interval and potentially making slight adjustments when there's a sufficient difference in load to warrant it.
The goal is to get an initial version we can profile and gather feedback on; we're not married to a specific implementation.
In the course of benchmarking the original implementation, I observed undesired growth in watchdog time, so I did end up introducing an inverse map from connection back to writers.
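As a loose illustration of that inverse map (the type and field names here are hypothetical, not the library's actual ones), keeping the writer-to-connection and connection-to-writers maps in sync lets the watchdog enumerate a single connection's writers without scanning every attached writer:

// Hypothetical sketch only, not the actual managedwriter routing types.
package routingexample

import "sync"

type connection struct{ id string }

type multiplexRouting struct {
	mu       sync.Mutex
	byWriter map[string]*connection          // writer ID -> assigned connection
	byConn   map[*connection]map[string]bool // connection -> set of writer IDs
}

// assign routes a writer to a connection, keeping both maps consistent so a
// later rebalance can look up the writers on one connection directly.
func (r *multiplexRouting) assign(writerID string, conn *connection) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if prev, ok := r.byWriter[writerID]; ok {
		delete(r.byConn[prev], writerID) // drop the stale inverse entry on reassignment
	}
	r.byWriter[writerID] = conn
	if r.byConn[conn] == nil {
		r.byConn[conn] = make(map[string]bool)
	}
	r.byConn[conn][writerID] = true
}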
This PR introduces a more advanced sharedRouter suitable for handling both exclusive and multiplex stream traffic. Overall this improves memory slightly (we no longer have a pool per exclusive stream). We switch to using the new router for all real traffic, but some unit testing still leverages the simple router where we're not exercising routing. With this change a client will have at most one connection pool per region, rather than one per exclusive connection as with the older router.

The router maintains independent data structures (and mutexes) for exclusive and multiplex connections. Exclusive mappings are easy: they're always 1:1. For multiplex, we potentially adjust multiplex connection pool sizes as writers attach and detach, and we maintain mappings in both directions between connections and writers. There's also a watchdog goroutine that periodically curates the multiplex routing, looking for opportunities to grow and rebalance traffic, rather than trying to do that work while routing an append to a connection.
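As a minimal sketch of that watchdog shape (names, signature, and cadence are illustrative, not the actual implementation), the idea is to move curation off the append path and onto a timer:

// Illustrative only, not the library's actual API.
package routingexample

import (
	"context"
	"time"
)

// watchdog wakes on an interval and invokes curate, which would snapshot
// connection loads, grow the pool, and rebalance writers as needed, instead of
// doing that work while routing each append.
func watchdog(ctx context.Context, interval time.Duration, curate func()) {
	t := time.NewTicker(interval)
	defer t.Stop()
	for {
		select {
		case <-ctx.Done():
			return // pool shutdown ends curation
		case <-t.C:
			curate()
		}
	}
}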
Benchmarking exercises routing for both explicit-stream and default-stream writers, using both serial and concurrent strategies. We also benchmark watchdog timing to understand how expensive adjusting the pool may be; it's inherently unstable, so we force benchmarking to run with a fixed -benchtime.
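For reference, a stand-in benchmark of this shape (the router stub below is illustrative, not the PR's actual benchmark code) would be pinned to a fixed iteration count, e.g. go test -bench=BenchmarkRouting -benchtime=100000x, so pool adjustments don't skew iteration counts:

// Illustrative stand-in benchmark; place in a _test.go file.
package routingexample

import "testing"

type stubRouter struct{ conns []string }

// pick is a trivial stand-in for routing an append to a connection.
func (r *stubRouter) pick(i int) string { return r.conns[i%len(r.conns)] }

func BenchmarkRouting(b *testing.B) {
	r := &stubRouter{conns: []string{"c0", "c1", "c2", "c3"}}
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		_ = r.pick(i)
	}
}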
Benchmark results here
Towards: #7103