Conversation
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: yolocs

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.

/retest
pkg/broker/handler/providers.go (Outdated)

@@ -37,6 +38,11 @@ var (

```go
DefaultHTTPClient = &http.Client{
	Transport: &ochttp.Transport{
		Base: &http.Transport{
			MaxIdleConns:        1000,
			MaxIdleConnsPerHost: 1000,
```
Since we generally have more than one target, I think this value should be smaller than MaxIdleConns.

I suggest the following numbers. @grantr any thoughts on this?

```go
MaxIdleConns: 1000,
// Assume an average of 20 triggers, just a guess. This makes sure idle
// conns are more fairly distributed across targets.
MaxIdleConnsPerHost: 50,
// Also set max conns per target to limit the resource usage per target.
// This should also help limit the memory usage.
MaxConnsPerHost: 100,
```
Open to suggestion. Some of my thinking:

- We don't have to guarantee fairness here. When you have 100 triggers but only one is actively receiving events, I see no reason not to let it take more idle connections. It should naturally evolve to "active triggers have more idle conns and less active ones have fewer". Although it might be worth setting MaxIdleConnsPerHost a little bit less than MaxIdleConns: under high concurrency I occasionally saw errors where a new idle conn for a host couldn't be pooled because the transport was at max idle conns.
- Setting MaxConnsPerHost should effectively limit resource usage. But there is a balance between this and maximizing throughput, especially since we don't have "queue length based" autoscaling. See the sketch below.
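To make that second point concrete, here is a small self-contained sketch (my illustration using Go's standard library, not code from this PR): with MaxConnsPerHost set, the number of concurrent requests a single target can see is bounded, which is exactly the per-target resource cap versus throughput tradeoff described above.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync"
	"sync/atomic"
	"time"
)

func main() {
	var inflight, peak int64

	// Test server that tracks the peak number of concurrent requests it sees.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		n := atomic.AddInt64(&inflight, 1)
		defer atomic.AddInt64(&inflight, -1)
		for {
			p := atomic.LoadInt64(&peak)
			if n <= p || atomic.CompareAndSwapInt64(&peak, p, n) {
				break
			}
		}
		time.Sleep(50 * time.Millisecond) // hold the connection briefly
		fmt.Fprint(w, "ok")
	}))
	defer srv.Close()

	// Cap total connections (dialing + active + idle) to any single host at 2.
	client := &http.Client{Transport: &http.Transport{MaxConnsPerHost: 2}}

	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			resp, err := client.Get(srv.URL)
			if err != nil {
				return
			}
			io.Copy(io.Discard, resp.Body) // drain so the conn can be reused
			resp.Body.Close()
		}()
	}
	wg.Wait()

	// With HTTP/1.1 (one in-flight request per connection), peak is at most 2:
	// per-target resource usage is capped, but the 10 requests were serialized
	// onto 2 connections, i.e. lower throughput under a burst.
	fmt.Println("peak concurrent requests at server:", atomic.LoadInt64(&peak))
}
```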
RE #1: we need to find a good balance point. A high MaxIdleConnsPerHost handles the "small number of active triggers" scenario well, but not so well when many triggers are active. So I took a guess of 20 active triggers on average; that might be too big. Ideally we would make this configurable, but we have to pick a value for now. How about setting MaxIdleConnsPerHost to 100 or 200?

RE #2: agreed that it's a tradeoff. Since we can run into OOM, setting this will help. Maybe something like 200 is a good balance point?
I'm thinking a bigger number like 500 for both MaxIdleConnsPerHost and MaxConnsPerHost. Since I now set a high memory limit with HPA at 50%, that should help with the OOM problem (not always, but it should help some). Thoughts?
Is it better to have MaxIdleConnsPerHost < MaxConnsPerHost? That gives more room for other triggers to maintain some idle connections; the tradeoff is that it's less efficient at handling surges.

I don't think we can come up with perfect numbers. We may need to tune them later anyway. I will lgtm with whatever you come up with :)
Makes sense. Set them both to 500 now.
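For reference, a minimal sketch of where the settings landed, assuming the shape shown in the diff above (the package name is my assumption from the file path):

```go
package handler // assumed from pkg/broker/handler/providers.go

import (
	"net/http"

	"go.opencensus.io/plugin/ochttp"
)

// DefaultHTTPClient keeps MaxIdleConns at 1000 and sets both
// MaxIdleConnsPerHost and MaxConnsPerHost to 500: active triggers can
// still hold a large share of the idle pool, while per-target connection
// count (and thus memory) is capped.
var DefaultHTTPClient = &http.Client{
	Transport: &ochttp.Transport{
		Base: &http.Transport{
			MaxIdleConns:        1000,
			MaxIdleConnsPerHost: 500,
			MaxConnsPerHost:     500,
		},
	},
}
```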
/lgtm
Some mitigation for #1265

Proposed Changes

- Set MaxIdleConns for the delivery HTTP client (would help reuse connections and reduce the chance of DNS errors); see the sketch at the end.

Release Note

Docs
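As an illustration of the DNS point in the Proposed Changes: with Go's transport, a request served from a pooled keep-alive connection skips DNS resolution and dialing entirely. A minimal sketch using httptrace (my example, not code from this repo):

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptrace"
)

// get issues one GET and reports whether the underlying connection was
// reused. A reused keep-alive connection means no DNS lookup and no dial
// happened for that request.
func get(client *http.Client, url string) {
	trace := &httptrace.ClientTrace{
		GotConn: func(info httptrace.GotConnInfo) {
			fmt.Println("connection reused:", info.Reused)
		},
	}
	req, err := http.NewRequest("GET", url, nil)
	if err != nil {
		panic(err)
	}
	req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))
	resp, err := client.Do(req)
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	io.Copy(io.Discard, resp.Body) // drain so the conn returns to the idle pool
	resp.Body.Close()
}

func main() {
	client := &http.Client{Transport: &http.Transport{MaxIdleConns: 1000}}
	get(client, "https://example.com") // first call: fresh dial, DNS lookup
	get(client, "https://example.com") // second call: idle conn reused, no DNS
}
```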