refactor(bigquery/storage/managedwriter): more advanced routing #7587
Conversation
This PR allows the flow controller to report bytes in flight for flow controllers with a bounded byte limit. The primary load signals for a connection are the inserts and bytes in flight as reported by the flow controller, and this change makes bytes in flight a usable signal. Important note: an unbounded flow controller will not report any bytes in flight. This avoids odd situations, due to size normalization, where the bytes tracked and the actual capacity of the semaphore could drift out of sync. Towards: googleapis#7103
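As a rough sketch of that reporting behavior (the field and method names here are hypothetical; the real flowController in managedwriter differs in its details), a controller configured without a byte bound simply reports zero bytes in flight rather than a value that could drift from the semaphore's real capacity:

// Hypothetical sketch only, not the actual managedwriter flow controller.
package routingexample

import "sync/atomic"

type flowController struct {
	maxBytes      int64 // <= 0 means the controller is unbounded on bytes
	bytesInFlight int64 // only tracked when a byte bound exists
}

// track records sz bytes entering flight, but only for bounded controllers,
// so the tracked total can't diverge from the byte semaphore's capacity.
func (fc *flowController) track(sz int64) {
	if fc.maxBytes > 0 {
		atomic.AddInt64(&fc.bytesInFlight, sz)
	}
}

// release records sz bytes leaving flight for bounded controllers.
func (fc *flowController) release(sz int64) {
	if fc.maxBytes > 0 {
		atomic.AddInt64(&fc.bytesInFlight, -sz)
	}
}

// bytes reports bytes in flight as a load signal; unbounded controllers report zero.
func (fc *flowController) bytes() int64 {
	if fc.maxBytes <= 0 {
		return 0
	}
	return atomic.LoadInt64(&fc.bytesInFlight)
}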
I will continue the review tomorrow
// only look for rebalance opportunities between different connections.
for mostIdleIdx != leastIdleIdx {
	// targetConn is the least idle (most heavily loaded) connection in the snapshot.
	targetConn := sr.multiConns[leastIdleIdx]
	// compare its load against the most idle connection, scaled by the rebalance threshold.
	if targetConn.curLoad() < mostIdleLoad*connLoadDeltaThreshold {
Isn't targetConn's curLoad liable to change while we're rebalancing, since messages are still being processed? If so, we may not actually be checking the busiest connection in the ordered list.
Yes, the entire system is dynamic. We're essentially taking a snapshot of the connections on an interval and potentially making slight adjustments when there's a sufficient difference in load to warrant it.
The goal is to get an initial version we can profile and gather feedback on; we're not married to a specific implementation.
In the course of benchmarking the original implementation, I observed undesired growth in watchdog time, so I did end up introducing an inverse map from connection back to writers.
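As a loose illustration of that inverse map (the type and field names here are hypothetical, not the library's actual ones), keeping the writer-to-connection and connection-to-writers maps in sync lets the watchdog enumerate a single connection's writers without scanning every attached writer:

// Hypothetical sketch only, not the actual managedwriter routing types.
package routingexample

import "sync"

type connection struct{ id string }

type multiplexRouting struct {
	mu       sync.Mutex
	byWriter map[string]*connection          // writer ID -> assigned connection
	byConn   map[*connection]map[string]bool // connection -> set of writer IDs
}

// assign routes a writer to a connection, keeping both maps consistent so a
// later rebalance can look up the writers on one connection directly.
func (r *multiplexRouting) assign(writerID string, conn *connection) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if prev, ok := r.byWriter[writerID]; ok {
		delete(r.byConn[prev], writerID) // drop the stale inverse entry on reassignment
	}
	r.byWriter[writerID] = conn
	if r.byConn[conn] == nil {
		r.byConn[conn] = make(map[string]bool)
	}
	r.byConn[conn][writerID] = true
}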
This PR introduces a more advanced sharedRouter suitable for handling both exclusive and multiplex stream traffic. Overall this improves memory slightly (we no longer have a pool per exclusive stream). We switch to using the new router for all real traffic, but some unit testing still leverages the simple router where we're not exercising routing. With this change a client will have at most one connection pool per region, rather than one per exclusive connection as with the older router.

The router maintains independent data structures (and mutexes) for exclusive and multiplex connections. Exclusive mappings are easy: they're always 1:1. For multiplex, we potentially adjust multiplex connection pool sizes as writers attach and detach, and we maintain mappings in both directions between connections and writers. There's also a watchdog goroutine that periodically curates the multiplex routing, looking for opportunities to grow and rebalance traffic, rather than trying to do that work while routing an append to a connection.
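As a minimal sketch of that watchdog shape (names, signature, and cadence are illustrative, not the actual implementation), the idea is to move curation off the append path and onto a timer:

// Illustrative only, not the library's actual API.
package routingexample

import (
	"context"
	"time"
)

// watchdog wakes on an interval and invokes curate, which would snapshot
// connection loads, grow the pool, and rebalance writers as needed, instead of
// doing that work while routing each append.
func watchdog(ctx context.Context, interval time.Duration, curate func()) {
	t := time.NewTicker(interval)
	defer t.Stop()
	for {
		select {
		case <-ctx.Done():
			return // pool shutdown ends curation
		case <-t.C:
			curate()
		}
	}
}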
Benchmarking exercises routing for both explicit-stream and default-stream writers, using both serial and concurrent strategies. We also benchmark watchdog timing to understand how expensive adjusting the pool may be; it's inherently unstable, so we force benchmarking to run with a fixed -benchtime.
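For reference, a stand-in benchmark of this shape (the router stub below is illustrative, not the PR's actual benchmark code) would be pinned to a fixed iteration count, e.g. go test -bench=BenchmarkRouting -benchtime=100000x, so pool adjustments don't skew iteration counts:

// Illustrative stand-in benchmark; place in a _test.go file.
package routingexample

import "testing"

type stubRouter struct{ conns []string }

// pick is a trivial stand-in for routing an append to a connection.
func (r *stubRouter) pick(i int) string { return r.conns[i%len(r.conns)] }

func BenchmarkRouting(b *testing.B) {
	r := &stubRouter{conns: []string{"c0", "c1", "c2", "c3"}}
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		_ = r.pick(i)
	}
}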
Benchmark results here
Towards: #7103