
dns: Add configurable-dns-pod-placement enhancement #663

Conversation

@Miciah Miciah (Contributor) commented Feb 23, 2021

This enhancement enables cluster administrators to configure the placement of the CoreDNS Pods that provide the cluster DNS service.

@Miciah Miciah force-pushed the dns-add-configurable-dns-pod-placement-enhancement branch from e2418d0 to 1dea562 on February 23, 2021 18:56

// nodePlacement enables explicit control over the scheduling of DNS pods.
//
// If unset, defaults are used. See nodePlacement for more details.
Contributor

Suggested change:
- // If unset, defaults are used. See nodePlacement for more details.
+ // If unset, defaults are used. See dnsNodePlacement for more details.

Is this what you meant?

Contributor Author

I think nodePlacement is right; the godoc should reference the field by its name, which is nodePlacement.

@sgreene570 sgreene570 (Contributor) Feb 23, 2021

Would it make sense to say "refer to x for more info on x" though?
Maybe I'm overthinking this. https://github.com/openshift/api/blob/master/operator/v1/types_ingress.go#L134 uses the type rather than the field name (ah, well, actually that's ambiguous!)

Contributor

Either way works for me. Not worth bike shedding over this.

By default, DNS Pods run on untainted Linux nodes. The `NodePlacement` field
enables cluster administrators to specify alternative parameters. For example,
the following DNS resource specifies that DNS Pods should run only on "infra"
nodes (i.e., nodes that have the "node-role.kubernetes.io/infra" label):
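The YAML for this example is not part of the quoted hunk above. Purely as a hedged sketch, the placement it describes could be expressed with the proposed Go API types along the following lines (the `DNSNodePlacement` type and its fields are assumptions based on the field names discussed in this review, not the merged API):

```go
// Hypothetical sketch only: pin DNS Pods to nodes that carry the "infra" role
// label, using the proposed nodePlacement field. Type and field names are
// assumed from this review and may differ from the final API.
package example

import (
	operatorv1 "github.com/openshift/api/operator/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// infraOnlyDNS restricts DNS Pods to nodes labeled node-role.kubernetes.io/infra.
var infraOnlyDNS = operatorv1.DNS{
	ObjectMeta: metav1.ObjectMeta{Name: "default"},
	Spec: operatorv1.DNSSpec{
		NodePlacement: operatorv1.DNSNodePlacement{
			NodeSelector: map[string]string{
				"node-role.kubernetes.io/infra": "",
			},
		},
	},
}
```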
Contributor

Might be worth mentioning that this example assumes infra nodes have no infra node taints.

Contributor Author

Thanks! I amended the example to include a toleration for any "node-role.kubernetes.io/infra" NoSchedule taint. I don't know how prevalent the use of this taint is, but I gather that it is something that cluster administrators who configure infra nodes often use, and adding the toleration shouldn't cause any problems if the taint doesn't exist, so the example might as well include it.
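As a rough illustration of the amendment described above (the amended YAML itself is not shown in this thread), the toleration would presumably look something like this, expressed with the core/v1 Go types:

```go
// Hypothetical sketch of the toleration described above. It lets DNS Pods
// schedule onto nodes that carry the node-role.kubernetes.io/infra:NoSchedule
// taint and is harmless on nodes that do not have that taint.
package example

import corev1 "k8s.io/api/core/v1"

var infraToleration = corev1.Toleration{
	Key:      "node-role.kubernetes.io/infra",
	Operator: corev1.TolerationOpExists,
	Effect:   corev1.TaintEffectNoSchedule,
}
```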

@Miciah Miciah force-pushed the dns-add-configurable-dns-pod-placement-enhancement branch 2 times, most recently from 922c81f to 76c4ff3 on March 22, 2021 02:33
@Miciah Miciah force-pushed the dns-add-configurable-dns-pod-placement-enhancement branch from 76c4ff3 to 8976fdf on March 29, 2021 18:01
@Miciah Miciah force-pushed the dns-add-configurable-dns-pod-placement-enhancement branch from 8976fdf to d24f542 on March 29, 2021 18:06
@sgreene570 (Contributor) commented:

/retest
/lgtm
/hold for others to review

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 29, 2021
@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Mar 29, 2021

### User Stories

#### As a cluster administrator, I must comply with a security policy that prohibits communication among worker nodes
Contributor

Why is solving this use case with explicit placement more appropriate than using service topology? This is a pretty specific constraint (and doesn't cover other node-to-node traffic like monitoring, SDN, control-plane traffic, etc.). While constraining DNS to particular nodes is reasonable in its own right for administrator topology choices, this specific example is fairly vague / not particularly complete.

Contributor Author

Service topology is alpha and deprecated, to be superseded by internal traffic policy, which is alpha (and not yet implemented AFAIK). Users have been strenuously complaining about this for a while—some security policies require prohibiting traffic among worker nodes, so DNS needs to run on non-worker nodes only.

such a way that DNS Pods could not be scheduled to any node, rendering the DNS
service unavailable. Because the DNS service is critical to other cluster
components, including OAuth, fixing misconfigured DNS Pod placement parameters
could be impossible for the cluster administrator.
Contributor

Is the only scenario that could block fixing this that the user is unable to log in? I'm skeptical that a one-time check for placement is the right approach, vs (for example) always ensuring DNS runs on the control-plane nodes if there are no valid worker nodes in use (or separately as a fallback). Emulating the scheduler is probably not a great idea (vs detecting zero pods at runtime, or always running the previous topology until the new topology is in place, or "failing open" if no pods are detected). Generally, doing any sort of static detection of schedulability adds fragility to a system.

Contributor Author

Yeah, if DNS isn't available to resolve kubernetes.default.svc.cluster.local, then presumably that could break OAuth. If the administrator didn't already have an OAuth token handy, the problem might then be impossible to resolve. If the user did have an OAuth token, it should be possible to patch the DNS CR to revert the problematic node-placement parameters. I agree that trying to infer what the scheduler will do is an imperfect mitigation. We could add some logic to the reconcile loop to detect whether the operator has been reporting Degraded for >n minutes and has 0 DNS pods; would it be valid for the operator to revert spec.nodePlacement, or does the operator need to maintain some state to track that the current value in spec.nodePlacement resulted in having 0 pods and needs to be ignored?
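Purely as a sketch of the idea floated in this comment (not actual cluster-dns-operator code), such a check in the reconcile loop might look roughly like this; the function name, inputs, and grace period are all hypothetical:

```go
// Hypothetical sketch of the "Degraded for >n minutes with 0 DNS pods" check
// discussed above. Names and the grace period are invented for illustration;
// this is not the cluster-dns-operator implementation.
package example

import "time"

const nodePlacementGracePeriod = 5 * time.Minute // the "n minutes" above

// shouldIgnoreNodePlacement reports whether the configured spec.nodePlacement
// should be treated as unschedulable: the operator has been Degraded since
// degradedSince and no DNS Pods are available.
func shouldIgnoreNodePlacement(degradedSince time.Time, availableDNSPods int, now time.Time) bool {
	if availableDNSPods > 0 || degradedSince.IsZero() {
		return false
	}
	return now.Sub(degradedSince) > nodePlacementGracePeriod
}
```

Whether the operator could then safely revert or ignore spec.nodePlacement, as asked above, is the open question; a check like this only identifies the condition.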

@knobunc knobunc (Contributor) commented Apr 9, 2021

/approve

@openshift-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: knobunc

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 9, 2021
@knobunc knobunc (Contributor) commented Apr 9, 2021

/lgtm

@knobunc knobunc (Contributor) commented Apr 9, 2021

/hold cancel

We can address any other concerns in follow-up PRs.

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 9, 2021
@openshift-merge-robot openshift-merge-robot merged commit b2dc667 into openshift:master Apr 9, 2021