Improve fairness on stream transmissions #949
Conversation
The current version of send multiplexing tries to send as much data from a single stream as possible before switching over to the next stream. The only situation where data from a stream other than the original one is sent is if there are no peer flow control credits available anymore, or if the application didn't feed new data. If neither of these situations occurs, the other streams will be starved. This is typically not what people expect, and it can lead to high latencies on streams which are queued behind earlier streams.

This change makes the send behavior "fairer" by enqueuing streams which have just sent data **after** all other pending streams instead of in front of them. This is achieved by switching to a `VecDeque` and using the right insert/remove orders.

Stats (obtained via quinn-rs#946), when running

```
./bulk --streams 300 --max_streams 10 --stream_size 10
```

Before:

```
Stream metrics:
AVG : Throughput: 172.38 MiB/s, Duration: 395.00ms
P0  : Throughput: 0.83 MiB/s, Duration: 39.00ms
P10 : Throughput: 49.19 MiB/s, Duration: 40.00ms
P50 : Throughput: 232.50 MiB/s, Duration: 42.00ms
P90 : Throughput: 245.87 MiB/s, Duration: 203.00ms
P100: Throughput: 251.25 MiB/s, Duration: 12.08s
```

After:

```
Stream metrics:
AVG : Throughput: 24.20 MiB/s, Duration: 413.00ms
P0  : Throughput: 21.42 MiB/s, Duration: 279.00ms
P10 : Throughput: 23.25 MiB/s, Duration: 403.00ms
P50 : Throughput: 24.34 MiB/s, Duration: 410.00ms
P90 : Throughput: 24.81 MiB/s, Duration: 430.00ms
P100: Throughput: 35.78 MiB/s, Duration: 466.00ms
```

With this change throughput on streams is a lot more consistent than before.
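For illustration, here is a minimal sketch of the round-robin idea described above. This is not Quinn's actual internals; `PendingStreams`, `StreamId`, and the `still_has_data` callback are hypothetical. The point is that a stream which just sent data is re-queued at the *back* of a `VecDeque`, so every other pending stream gets a turn before it is scheduled again, whereas the old behavior kept serving the same stream until it ran out of data or flow-control credit.

```rust
use std::collections::VecDeque;

/// Hypothetical stream identifier; Quinn's real type differs.
type StreamId = u64;

/// A minimal round-robin scheduler for streams with pending data.
struct PendingStreams {
    queue: VecDeque<StreamId>,
}

impl PendingStreams {
    fn new() -> Self {
        Self { queue: VecDeque::new() }
    }

    /// Register a stream that has data to send (if not already queued).
    fn push(&mut self, id: StreamId) {
        if !self.queue.contains(&id) {
            self.queue.push_back(id);
        }
    }

    /// Pick the next stream to send from. If the stream still has data
    /// left after filling the packet, it is re-queued at the back,
    /// behind all other pending streams, which is what makes the
    /// scheduling fair instead of LIFO-like.
    fn next(&mut self, still_has_data: impl Fn(StreamId) -> bool) -> Option<StreamId> {
        let id = self.queue.pop_front()?;
        if still_has_data(id) {
            self.queue.push_back(id);
        }
        Some(id)
    }
}

fn main() {
    let mut pending = PendingStreams::new();
    for id in 0..3 {
        pending.push(id);
    }
    // Pretend every stream always has more data: the scheduler now
    // rotates 0, 1, 2, 0, 1, 2, ... instead of draining stream 0 first.
    for _ in 0..6 {
        let id = pending.next(|_| true).unwrap();
        print!("{id} ");
    }
    println!();
}
```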
Awesome improvement! I remember we did some refactoring to make this possible a while back, but somehow never got around to actually switching the behavior...
Changes look good to me. I'm very confused about the stats posted here... Is this before and after? And if so, is the lack of fairness really causing a 300x differential in throughput between P0 and P100? I'm aware queueing theory can have some non-intuitive effects, but this seems larger than expected.
Yes, this is what happens. The old behavior effectively had LIFO scheduling: the server would only work on the latest request, and all others were left waiting. In this benchmark we send M requests at once. The server will work on one of those. Once it's done, the client will send a new request, and the server will directly start working on that one. This repeats until only M-1 of the initial requests are left. Those are subject to extremely poor latency, while everything else gets full speed. And since those are only 9 requests out of 300 in the example above, even the P10 metric will still look good. The P03 metric and below would start to show the poor performance. Btw: Those behaviors are also called out in https://qlog.edm.uhasselt.be/epiq/files/QUICandH3ImplementationDiversity_Marx_REVIEW9may2020.pdf , which I found yesterday. It mentions that a few implementations use LIFO and that it's not optimal. It however doesn't explicitly say which ones. I guess Quinn was among them.
The old behavior was also very sensitive to nondeterminism in the tokio scheduler, for extra unpredictable fun. These issues were called out to us a few times in the implementers' slack when people were analyzing interop behavior with qvis.