Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Default to excluding system ns's from injection #726

Merged
merged 3 commits into from
Oct 26, 2021

Conversation

lkysow
Copy link
Member

@lkysow lkysow commented Sep 15, 2021

kube-system is excluded because it's unlikely users will want to
provision Connect pods in that namespace and also because we don't want
to block pods being provisioned there if our webhook injector is down.

local-path-storage is excluded because this ns is used by kind to
provision PVCs and if ACLs are enabled then the install gets into a
deadlock where:

  • PVC can't be provisioned because Kind needs to create a Pod
  • Pod can't be created because injector webhook needs to be up
  • injector webhook can't come up until its got an ACL token
  • ACL token can't be provisioned because Consul server isn't up
  • Consul server can't be started because it doesn't have a PVC

NOTE: This matching is only supported in Kube 1.21+ where they've added
these labels to namespaces automatically now.

Should fix #715

How I've tested this PR:

  • Ran helm install on Kind using Kube 1.21 with ACLs and connect inject enabled. Saw that the servers never start. Then ran the same install with these changes. They do start.
  • acceptance tests pass

How I expect reviewers to test this PR:

  • code, can try it out if you'd like too

Checklist:

  • Tests added
  • CHANGELOG entry added

    HashiCorp engineers only, community PRs should not add a changelog entry.
    Entries should use present tense (e.g. Add support for...)

lawliet89 pushed a commit to lawliet89/consul-k8s that referenced this pull request Oct 6, 2021
@lkysow lkysow force-pushed the lkysow/kube-system-exclusion branch from 344428d to fd27556 Compare October 7, 2021 16:20
@lkysow lkysow marked this pull request as ready for review October 7, 2021 17:16
@lkysow lkysow force-pushed the lkysow/kube-system-exclusion branch from 7b8dd6e to d63d83b Compare October 7, 2021 18:29
@lkysow lkysow requested review from a team, ndhanushkodi and thisisnotashwin and removed request for a team October 7, 2021 18:29
@lkysow
Copy link
Member Author

lkysow commented Oct 7, 2021

I think the tests are failing because they're not using the actual chart changes in this PR.

kube-system is excluded because it's unlikely users will want to
provision Connect pods in that namespace and also because we don't want
to block pods being provisioned there if our webhook injector is down.

local-path-storage is excluded because this ns is used by kind to
provision PVCs and if ACLs are enabled then the install gets into a
deadlock where:
- PVC can't be provisioned because Kind needs to create a Pod
- Pod can't be created because injector webhook needs to be up
- injector webhook can't come up until its got an ACL token
- ACL token can't be provisioned because Consul server isn't up
- Consul server can't be started because it doesn't have a PVC

NOTE: This matching is only supported in Kube 1.21+ where they've added
these labels to namespaces automatically now.
@lkysow lkysow force-pushed the lkysow/kube-system-exclusion branch from d63d83b to c3ad2f5 Compare October 7, 2021 23:12
Copy link
Contributor

@thisisnotashwin thisisnotashwin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🦊

Copy link
Contributor

@ndhanushkodi ndhanushkodi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for missing this earlier. looks great, and the explanation for the deadlock was very helpful context.

@david-yu david-yu merged commit de51017 into main Oct 26, 2021
@david-yu david-yu deleted the lkysow/kube-system-exclusion branch October 26, 2021 21:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Pods on EKS Nodes don't start after upgrade to 0.33
4 participants