Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for PodDisruptionBudgets #2136

Closed
JorTurFer opened this issue Sep 20, 2023 · 17 comments · Fixed by #2141
Closed

Add support for PodDisruptionBudgets #2136

JorTurFer opened this issue Sep 20, 2023 · 17 comments · Fixed by #2141
Assignees
Labels
enhancement New feature or request

Comments

@JorTurFer
Copy link
Contributor

Although sidecar and daemonset don't allow pdb, deployment and statefulset modes are workload types where pdbs make sense totally, as they are centralized workloads.
Currently, we are using the collector helm chart because it supports pdbs but we'd like to use the operator for using Target Allocator, but the pdbs are a requirement for us.

I think that adding support to pdbs is a good addition to the operator and I'm willing to contribute with the feature if you also agree with having it

@pavolloffay
Copy link
Member

thanks for the request @JorTurFer

Do you want to use PodDisruptionBudgets for target allocator as well?

@pavolloffay pavolloffay added area:controller enhancement New feature or request labels Sep 20, 2023
@JorTurFer
Copy link
Contributor Author

I haven't tested properly the target allocator yet, but if it's something that requires HA, yes. I mean, I know that we can use affinities to prevent some scenarios, but we have to ensure that we have at least X instances of the collector working. That's why we needed the pdbs

@JorTurFer
Copy link
Contributor Author

Do you see the request useful? I'm willing to implement (but obviously, I prefer to ask before rather than drafting a PR that never is merged xD)

@pavolloffay
Copy link
Member

I am not well familiar with the concept perhaps @jaronoff97 ?

I found this

From version 1.15 PDBs support custom controllers where the scale subresource is enabled.

https://kubernetes.io/docs/tasks/run-application/configure-pdb/#identify-an-application-to-protect

otelcol CRD implements the scale subresource

@JorTurFer
Copy link
Contributor Author

No no, I want to deploy a pdb for the generated deployment / statefulset. I mean, with this manifest:

apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: sidecar-for-my-app
spec:
  mode: deployment

The operator generates a deployment for the collector. I want to generate also a PDB for that generated deployment, not for the CRD itself

@jaronoff97
Copy link
Contributor

Hey @JorTurFer thanks for getting this going here :) Is there a reason we would do this in the operator and not just let users apply this themselves (that's what i do currently)? What pavol mentions is accurate in that if we were to do this in the operator we would want to do it the same way as how the hpa created by the operator works i.e. by targeting the scale subresource.

@JorTurFer
Copy link
Contributor Author

JorTurFer commented Sep 20, 2023

oooh! got it. Actually that's better 😄

The reason for adding the pdb is that you can define all the resources within OpenTelemetryCollector manifests. I mean, why don't let users to apply the ingress themselves? or the HPA? As a user, I can define all the aspects of the collector (configs, volumes,sa, services, ingress, hpa) except the pdb, which is important for the reliability of the system. As I said, I could use anti-affinity for ensuring that there isn't 2 collector instances on the same node for being safe on node disruptions (such as upgrades, scaling in, etc ) but the pod disruption budget is already for that (it's an api explicitly created for it indeed), so defining it in the same place where I define all the other stuff sounds worth (at least in my mind).

@jaronoff97
Copy link
Contributor

yeah makes sense! I think as long as we let it work with the collector in the same way as the HPA that would be great. I definitely would accept it!

@JorTurFer
Copy link
Contributor Author

JorTurFer commented Sep 20, 2023

Nice!
Let me take a look and give a try. You can assign the issue to me if you want :)
I'll draft a PR and we can discuss the implementation over the code 😄

@jaronoff97
Copy link
Contributor

sounds great, thanks for taking this on 🙇

@JorTurFer
Copy link
Contributor Author

I think that I have already something to review :)

@cmergenthaler
Copy link
Contributor

Hey @JorTurFer, thanks for adding the feature! Is there a reason why the PDB hasn't been added for Target Allocator as well?

@JorTurFer
Copy link
Contributor Author

As I understood the TargetAllocator isn't critical and doesn't require HA (but maybe I'm wrong and we have to add it there too)
I mean, it does not process real time traffic but configure the collector, so 5-10 seconds down are not critical

@cmergenthaler
Copy link
Contributor

Thanks for the explanation. From what I've understood, it is recommended to run at least 2 replicas of TA, so that it is also HA, see #2159 (comment)
So I think having pdb in place would be valid as well

@jaronoff97
Copy link
Contributor

@cmergenthaler i think that's a valid concern – would you mind opening up another issue? I think there's reason here to do some more work to pipe these things better for user concerns. I'll add a discussion topic for our SIG meeting this week.

@cmergenthaler
Copy link
Contributor

@jaronoff97 done: #2261

@JorTurFer
Copy link
Contributor Author

Oh, I misunderstood the TA I guess.
Let's continue the discussion on the issue :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants