performance issues with overcommit webhook #646

Open
googs1025 opened this issue Jul 9, 2024 · 7 comments


@googs1025
Contributor

Will the overcommit implementation in katalyst-core's internal webhook cause performance issues if there are hundreds of nodes in the cluster?

@googs1025
Contributor Author

/kind support

@WangZzzhe
Collaborator

WangZzzhe commented Jul 9, 2024

The webhook's QPS grows with the number of nodes in the cluster. To address this, we have simplified the webhook's design. In the current version, the overcommit webhook does no matching or calculation of its own; it directly uses the results computed by the katalyst overcommit controller to update the node status. If your cluster has a large number of nodes, we recommend enabling overcommit on nodes gradually and monitoring the webhook's resource metrics. @googs1025
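To illustrate the design described above, here is a minimal sketch of a mutation step that only applies a precomputed ratio. It is not katalyst-core's actual code: the `Node` type, the annotation key `example.com/cpu-overcommit-ratio`, and the `applyOvercommit` function are all hypothetical names, and the sketch assumes a controller has already written the ratio onto the node.

```go
package main

import (
	"fmt"
	"strconv"
)

// ratioAnnotation is a hypothetical annotation key, assumed to be
// written by a controller before the webhook sees the node.
const ratioAnnotation = "example.com/cpu-overcommit-ratio"

// Node is a simplified stand-in for a Kubernetes node object.
type Node struct {
	Annotations         map[string]string
	AllocatableMilliCPU int64
}

// applyOvercommit returns the allocatable CPU after applying the
// precomputed ratio. It does no matching: one map lookup, one parse,
// one multiplication per request.
func applyOvercommit(n Node) (int64, error) {
	raw, ok := n.Annotations[ratioAnnotation]
	if !ok {
		// No ratio published for this node: leave it untouched.
		return n.AllocatableMilliCPU, nil
	}
	ratio, err := strconv.ParseFloat(raw, 64)
	if err != nil {
		return 0, fmt.Errorf("parse %s=%q: %w", ratioAnnotation, raw, err)
	}
	if ratio < 1 {
		// Assumption of this sketch: never shrink allocatable.
		return n.AllocatableMilliCPU, nil
	}
	return int64(float64(n.AllocatableMilliCPU) * ratio), nil
}

func main() {
	n := Node{
		Annotations:         map[string]string{ratioAnnotation: "1.5"},
		AllocatableMilliCPU: 8000,
	}
	got, err := applyOvercommit(n)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(got)
}
```

Under these assumptions, the per-request cost is constant and independent of cluster size; only the request rate scales with the number of nodes.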

@googs1025
Contributor Author

@WangZzzhe
Thanks for your answer. I still see a calculation in the source code. See:
https://github.com/kubewharf/katalyst-core/blob/main/pkg/webhook/mutating/node/allocatable_mutator.go#L101

@googs1025
Contributor Author

But I don't think this calculation will affect performance. What I care about is whether we took performance into consideration when designing.

@HaijunMa

HaijunMa commented Jul 9, 2024

> But I don't think this calculation will affect performance. What I care about is whether we took performance into consideration when designing.

What are the main aspects of the performance impact you are talking about?

@googs1025
Contributor Author

> > But I don't think this calculation will affect performance. What I care about is whether we took performance into consideration when designing.
>
> What are the main aspects of the performance impact you are talking about?

We may have hundreds of nodes in our cluster, and I am not sure whether that will cause performance issues.

@googs1025
Contributor Author

Because hundreds of nodes need to report their heartbeat information through this webhook.


3 participants