-
Notifications
You must be signed in to change notification settings - Fork 47
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
metrics: add initial support for SLI/SLO metrics #461
Conversation
a4d06b7
to
666670d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not clear on the deployment side:
If I understand correctly, the new metrics binary will be shipped in the same image as kata-monitor, but the image will be run twice: once for kata-monitor (with the daemonset defined by the operator's code), and a second time with the deployment files you're adding here?
I'm not sure about the benefit of using the same image in separate containers that do different things.
If it is the same image, why not run the two binaries in parallel from it?
If we can't / don't want to have both binaries running from the same place, why do we make a single image ?
666670d
to
58e5537
Compare
Yes. Currently kata-monitor binary is part of the "osc-monitor" image. And the idea is to reuse this image for "monitoring stuff". With both kata-monitor and the metrics server.
Even though they're in the same image, running them in separate containers makes it easier to manage and scale each one independently. This way, kata-monitor can continue running as a DaemonSet on every node, while the metrics server can be deployed separately as a Deployment, running only where needed. Running both in the same container is possible and I considered this option too... but it can get tricky with managing processes, restarts, and resource allocation. By using the same image for both but keeping them in their own containers, we get the best of both worlds: fewer images to manage (specially on the downstream configuration), but still clear separation between the two services. Wdyt? |
I actually don't like the idea of running both processes from the same container :-) I only mentioned it to understand where the benefit is. So I'm fine running them separately. Now I understand this is to be balanced with the complexity of building/publishing/maintaining separate images, so I don't want to enforce that. Just feeling this is weird to deploy two applications in a single pod. I can go with it if other reviewers are OK. |
The separation won’t force us to use two applications in a single pod. Each will run in its own pod: kata-monitor as a DaemonSet and the metrics server as a Deployment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Running both in the same container is possible and I considered this option too... but it can get tricky with managing processes, restarts, and resource allocation. By using the same image for both but keeping them in their own containers, we get the best of both worlds: fewer images to manage (specially on the downstream configuration), but still clear separation between the two services. Wdyt?
I actually don't like the idea of running both processes from the same container :-) I only mentioned it to understand where the benefit is. So I'm fine running them separately. But I tend to think that each container image should contain the things it needs to run, and nothing more - the simpler, the better. If we run them separately, then I'd be in favor of using separate images.
Now I understand this is to be balanced with the complexity of building/publishing/maintaining separate images, so I don't want to enforce that. Just feeling this is weird to deploy two applications in a single pod.
I can go with it if other reviewers are OK.
I guess it is acceptable to ship some extra metric logic in the existing image if it eases the work.
IIUC this is targeting peer pods exclusively. Maybe worth to mention it in the commit message. |
5db0bcc
to
13ec837
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a couple of cosmetic changes left but rest LGTM.
Thanks @beraldoleal !
ce49b19
to
030ae01
Compare
Introduces high-level SLI metrics that can help derive SLOs for kata-remote. Currently, the osc-monitor image only includes the kata-monitor binary. The idea is to reuse the same image to also include the metrics-server. For now, the idea is to collect metrics for kata-remote, but we can expand later to get more metrics. Signed-off-by: Beraldo Leal <bleal@redhat.com>
030ae01
to
4f8386a
Compare
@beraldoleal: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks @beraldoleal
@gkurz can we merge this one? |
Introduces high-level SLI metrics that can help derive SLOs.
Currently, the osc-monitor image only includes the kata-monitor binary. The idea is to reuse the same image to also include the metrics-server.