Adjust work queue rate limiter config #898
Merged
Scale testing revealed excessive artificial delays from the bucket rate limiter used for the informer work queues. Many items failed and were requeued for retry within a short period, so both the exponential and bucket rate limiters kicked in, resulting in delays of more than 2 hours.
The following adjustments to the rate limiter config alleviate the excessive delays:
Increase the exponential rate limiter base delay from 5 ms to 50 ms. This reduces the number of requeues before an item hits the 30 s max delay from 13 to 10.
At 5 ms:
1: Delay: 5ms
2: Delay: 10ms
3: Delay: 20ms
4: Delay: 40ms
5: Delay: 80ms
6: Delay: 160ms
7: Delay: 320ms
8: Delay: 640ms
9: Delay: 1.28s
10: Delay: 2.56s
11: Delay: 5.12s
12: Delay: 10.24s
13: Delay: 20.48s
14: Delay: 30s
At 50 ms:
1: Delay: 50ms
2: Delay: 100ms
3: Delay: 200ms
4: Delay: 400ms
5: Delay: 800ms
6: Delay: 1.6s
7: Delay: 3.2s
8: Delay: 6.4s
9: Delay: 12.8s
10: Delay: 25.6s
11: Delay: 30s
Increase the bucket rate limiter burst size from 100 to 500. The burst size is the number of items the limiter admits without delay; beyond it, items that exceed the configured rate are delayed.
Cap the maximum delay at 5 min.