Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove access to cluster from anonymous users #11016

Conversation

nicolas-goudry
Copy link
Contributor

@nicolas-goudry nicolas-goudry commented Mar 15, 2024

What type of PR is this?

/kind feature

What this PR does / why we need it:

For context, the current way Kubespray is dealing with kubeadm when adding a new node to the cluster is to rely on the cluster-info configmap. While this is working fine, it requires a role bound to anonymous users. This is described in the official documentation.

This PR adds an opt-in option remove_anonymous_access allowing users to get rid of the rolebinding linked to anonymous users, while ensuring kubeadm will continue to work fine by leveraging the file discovery feature.

When this option is true, the join process will not rely on reading the cluster-info configmap as an anonymous user. Instead the configmap will be read from the first control plane node, stored in a variable and wrote on the target node. The JoinConfiguration will then use the discovery.file.kubeConfigPath option instead of discovery.bootstrapToken.

Which issue(s) this PR fixes:

None.

Special notes for your reviewer:

This is a proposal, please challenge it, as well as its implementation (which is very naive and surely error prone).

If this proposal is accepted, I’ll add the relevant documentation.

Does this PR introduce a user-facing change?:

Add new option `remove_anonymous_access` to prevent granting RBAC permissions to anonymous users.

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Mar 15, 2024
@k8s-ci-robot
Copy link
Contributor

Hi @nicolas-goudry. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Mar 15, 2024
@nicolas-goudry
Copy link
Contributor Author

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 15, 2024
@MrFreezeex
Copy link
Member

Hi, thanks for your contribution! This looks great! The CI failure seems somewhat related though, so it seems to break a normal install right now :/.

To accept this I would like to have one of the e2e scenario where this new feature would be enabled and that kube_api_anonymous_auth is set to false (btw I think this new feature could be auto enabled if kube_api_anonymous_auth is set to false, WDYT?).

Also small nit but looking at the other variables they are all prefixed by kubeadm_ so I think it would make more sense to do the same here.

@nicolas-goudry
Copy link
Contributor Author

Hi, thanks for your contribution! This looks great! The CI failure seems somewhat related though, so it seems to break a normal install right now :/.

I’ll take a look at this ASAP.

To accept this I would like to have one of the e2e scenario where this new feature would be enabled and that kube_api_anonymous_auth is set to false (btw I think this new feature could be auto enabled if kube_api_anonymous_auth is set to false, WDYT?).

I’ll have to check how the e2e scenarios are handled, is there any documentation on this? I looked at the repository files and couldn’t find where the tests are located, would be great if you could point me toward them 🙂

As for enabling this option along with kube_api_anonymous_auth, I would be ok with it, but I’m wondering how the kubeadm joining process is working when this option is enabled since in the documentation it is stated that any unauthenticated request will result in a 401 response… I’ll have to perform some tests with this option as well.

Also small nit but looking at the other variables they are all prefixed by kubeadm_ so I think it would make more sense to do the same here.

Fair point, I’ll rename it to kubeadm_use_file_discovery.

@MrFreezeex
Copy link
Member

MrFreezeex commented Mar 18, 2024

To accept this I would like to have one of the e2e scenario where this new feature would be enabled and that kube_api_anonymous_auth is set to false (btw I think this new feature could be auto enabled if kube_api_anonymous_auth is set to false, WDYT?).

I’ll have to check how the e2e scenarios are handled, is there any documentation on this? I looked at the repository files and couldn’t find where the tests are located, would be great if you could point me toward them 🙂

Sure, so the ansible variables are passed in tests/files I would say to pick one test that correspond to the part2 of the pipeline

As for enabling this option along with kube_api_anonymous_auth, I would be ok with it, but I’m wondering how the kubeadm joining process is working when this option is enabled since in the documentation it is stated that any unauthenticated request will result in a 401 response… I’ll have to perform some tests with this option as well.

Ah! I assumed you were interested to allow disabling anonymous access entirely (fyi this option has been broken for years at this point). Not entirely sure about "only" removing kubeadm:bootstrap-signer-clusterinfo since kubeadm add this "by himself". AFAIK fetching the cluster info configmap was the main blocker for disabling anonymous auth entirely last time I checked but maybe after that it would fall to another issue that we don't know yet

@nicolas-goudry
Copy link
Contributor Author

Sure, so the ansible variables are passed in tests/files I would say to pick one test that correspond to the the part2 of the pipeline

Nice, I’ll give it a look.

Ah! I assumed you were interested to allow disabling anonymous access entirely (fyi this option has been broken for years at this point). Not entirely sure about "only" removing kubeadm:bootstrap-signer-clusterinfo since kubeadm add this "by himself". AFAIK fetching the cluster info configmap was the main blocker for disabling anonymous auth entirely last time I checked but maybe after that it would fall to another issue that we don't know yet

Maybe this is the way to go. I mean: fixing the kube_api_anonymous_auth value would supersede the purpose of this PR!

I’ll let you know how things go.

@nicolas-goudry
Copy link
Contributor Author

nicolas-goudry commented Mar 18, 2024

Here are the detailed step I performed in order to be able to install a cluster with the kube_api_anonymous_auth option set to false. TLDR: it didn’t work.

The first issue was that upon cluster initialization, the secondary control plane nodes failed to join the cluster since kubeadm relies (as explained in the initial PR description) on the anonymous authentication system (which is disabled in this case) to get and validate the cluster information.

Since the API server is not ready yet at this stage, we can’t rely on kubectl to get the cluster-info configmap from the kube-public namespace. However, we can use kubeadm to generate a kubeconfig file from the first control plane node and provide it to the secondary control plane nodes so that they can use it for the kubeadm file discovery join process. I tested this solution and it worked fine.

Then, another issue arose. This task tries to reach the /healthz endpoint of the API server via HTTP before going further with the installation. However, since the task is using the builtin uri Ansible module without any form of authentication, the request ends up failing after all 60 retries with a 401 HTTP error (which is expected, since anonymous authentication is disabled). This was easily fixable by using kubectl with the --raw flag to call the /healthz endpoint.

Lastly, the files changed in this PR had to be tweaked a bit so that the remaining nodes could join the cluster using the file discovery join process. Since the API server is ready at this point, we can rely on kubectl to get the cluster-info configmap from the kube-public namespace.


The main issue now is that the API server can't probe itself since it’s trying to call itself on the /livez (liveness + startup probe) and /readyz (readiness probe) using httpGet probes that ends being denied with a… 401 HTTP error. The result of this is that the API server is getting killed by the kubelet every now and then, making the whole cluster unstable. It seems this is a known limitation.

You can see the involved changes here.


IMO, having the possibility to somewhat “secure” the cluster by removing any clusterrole/role associated to anonymous users is at least a first (small) step in the right direction. WDYT?

@nicolas-goudry nicolas-goudry force-pushed the add-optional-kubeadm-file-discovery branch from ebfa2b1 to 1c2ee7d Compare March 18, 2024 21:12
@MrFreezeex
Copy link
Member

MrFreezeex commented Mar 18, 2024

Here are the detailed step I performed in order to be able to install a cluster with the kube_api_anonymous_auth option set to false. TLDR: it didn’t work.

Thanks you, it was very insightful 🙏. Now we now exactly why we can't support that as upstream said "you are on your own" 😅.

IMO, having the possibility to somewhat “secure” the cluster by removing any clusterrole/role associated to anonymous users is at least a first (small) step in the right direction. WDYT?

Sure, just a small followup question, is that all that is required to remove the clusterrole? Meaning that kubeadm won't create it with your patch as is, or do you need to create another task that does kubectl delete to remove it?

@nicolas-goudry
Copy link
Contributor Author

I didn’t find any way to prevent kubeadm from creating the role. AFAICT, it seems to be a “hardcoded” part of its workflow.

Indeed, we would have to create another task to remove the role ourselves. I need to perform more tests in order to make sure that nothing would be broken by this (ie. scale, upgrade).

If we decide to implement this, don’t you think we should have a more user-friendly option? Something like remove_anon_access that would in fact do two things:

  • enable kubeadm file discovery
  • remove the role linked to anonymous users

@MrFreezeex
Copy link
Member

MrFreezeex commented Mar 18, 2024

If we decide to implement this, don’t you think we should have a more user-friendly option? Something like remove_anon_access that would in fact do two things:

* enable kubeadm file discovery

* remove the role linked to anonymous users

Hmm I don't think it would be that meaningful to do one of those two things without the other so I guess that seems quite sensible, not sure about the naming though. it may depends in what role you would be doing this as well, let's see. Another option would be to have two variables and one of those two would default to the other one (i.e. remove_anon_access defaulting to kubeadm_use_file_discovery), but both ways looks ok to me.

@nicolas-goudry nicolas-goudry force-pushed the add-optional-kubeadm-file-discovery branch from 1c2ee7d to ca12384 Compare March 19, 2024 20:22
@nicolas-goudry
Copy link
Contributor Author

I checked the kubeadm code and found that the rolebinding is created in two cases:

  • init phase for bootstrap tokens
  • cluster upgrade

So I decided to implement this new feature by introducing a user-facing variable remove_anonymous_access which defaults to false. This variable is set in the sample inventory as well as in the kubespray-defaults role.

This variable controls the removal of the kubeadm:bootstrap-signer-clusterinfo rolebinding linked to anonymous users after the cluster initialization (control-plane role => kubeadm-setup.yml play) as well as after the cluster upgrade (control-plane role => kubeadm-upgrade.yml play).

Since at this point anonymous users don’t have permission to read the cluster-info configmap anymore, I configured the kubeadm-secondary.yml play of the control-plane role (handling control plane nodes other than the first one) as well as the main.yml play of the kubeadm role (handling non control plane nodes) to use kubeadm’s file discovery for the join process. This is controlled by the kubeadm_use_file_discovery variable which defaults to the value of remove_anonymous_access.

I tested these changes by performing a cluster installation, followed by a graceful upgrade of said cluster as well as a scale up by adding a new node. Please let me know if you wish me to perform more tests.

I also took the liberty to add a warning comment above the kube_api_anonymous_auth variable linking to my previous comment for further details about why it can’t be set to false.

One thing to note is that currently the kubeconfig used for kubeadm’s file discovery join process is left on all hosts except the first control plane node, does it matter? Should I remove them from the nodes after they join the cluster?

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Mar 19, 2024
@nicolas-goudry
Copy link
Contributor Author

/unhold

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 19, 2024
@nicolas-goudry nicolas-goudry changed the title PROPOSAL - Use kubeadm file discovery Remove access to cluster from anonymous users Mar 19, 2024
@nicolas-goudry nicolas-goudry marked this pull request as draft March 19, 2024 20:34
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 19, 2024
Copy link
Member

@MrFreezeex MrFreezeex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the work! 🙏

Couple of nits, nothing crucial, I would like to see remove_anonymous_access activated into an existing e2e tests (possibly an upgrade one if you think that's relevant) though. But other than that LGTM!

@MrFreezeex
Copy link
Member

One thing to note is that currently the kubeconfig used for kubeadm’s file discovery join process is left on all hosts except the first control plane node, does it matter? Should I remove them from the nodes after they join the cluster?

Well the kubeconfig is taken from a cluster-info that was previously public and those same info would also exists on similar files in the config dir so I don't think it truly matter to remove it. Even removing it could cause harm to debugging sessions when one would like to launch kubeadm outside of kubespray.

@nicolas-goudry
Copy link
Contributor Author

I would like to see remove_anonymous_access activated into an existing e2e tests

I’ll work on this tomorrow and remove the draft state once it’s done 🙂

Well the kubeconfig is taken from a cluster-info that was previously public and those same info would also exists on similar files in the config dir so I don't think it truly matter to remove it. Even removing it could cause harm to debugging sessions when one would like to launch kubeadm outside of kubespray.

Fair point! I’ll keep this as-is then.

@nicolas-goudry nicolas-goudry force-pushed the add-optional-kubeadm-file-discovery branch from cc08560 to c8631e3 Compare March 25, 2024 10:56
@nicolas-goudry
Copy link
Contributor Author

Some of those scenarios are to be triggered manually I guess it's probably one of them.

Yeah, I figured that, it needs a variable in order to be run, so I removed the option from this test.

All the test that doesn't have all-in-one in their name (or actually the mode var in their test file).

You could pick any non all-in-one test ran that start with packet_ from a pipeline (i.e.: your is there https://gitlab.com/kargo-ci/kubernetes-sigs-kubespray/-/pipelines/1225590732). For instance could be packet-debian12-calico and packet-debian11-calico-upgrade.

Thanks for the details! In the end I enabled the option in three test cases:

  • packet_ubuntu20-calico-all-in-one-hardening
  • packet_ubuntu20-calico-etcd-kubeadm
  • packet_debian11-calico-upgrade

If all tests pass, this should be ready for a last review 🙂

Copy link
Member

@MrFreezeex MrFreezeex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 25, 2024
@yankay
Copy link
Member

yankay commented Apr 3, 2024

Thanks @nicolas-goudry
/lgtm

Copy link
Contributor

@mzaian mzaian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: MrFreezeex, mzaian, nicolas-goudry

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 3, 2024
@k8s-ci-robot k8s-ci-robot merged commit c6fcbf6 into kubernetes-sigs:master Apr 3, 2024
60 checks passed
@nicolas-goudry nicolas-goudry deleted the add-optional-kubeadm-file-discovery branch April 3, 2024 06:54
@sathieu
Copy link
Contributor

sathieu commented Apr 8, 2024

@nicolas-goudry Can this be enabled by default? Or at least documented in hardening.md 🙏

@mzaian
Copy link
Contributor

mzaian commented Apr 8, 2024

Hi sathieu,

I would leave like that but of course we need to add it to the hardening.md and since that you suggested that. How about if you create a PR for this? ;)

@sathieu
Copy link
Contributor

sathieu commented Apr 9, 2024

@mzaian Done in #11068

@mzaian mzaian mentioned this pull request Apr 26, 2024
dibi-codes pushed a commit to fino-digital/kubespray that referenced this pull request May 7, 2024
* feat: add user facing variable with default

* feat: remove rolebinding to anonymous users after init and upgrade

* feat: use file discovery for secondary control plane nodes

* feat: use file discovery for nodes

* fix: do not fail if rolebinding does not exist

* docs: add warning about kube_api_anonymous_auth

* style: improve readability of delegate_to parameter

* refactor: rename discovery kubeconfig file

* test: enable new variable in hardening and upgrade test cases

* docs: add option to config parameters

* test: multiple instances and upgrade
pedro-peter pushed a commit to pedro-peter/kubespray that referenced this pull request May 8, 2024
* feat: add user facing variable with default

* feat: remove rolebinding to anonymous users after init and upgrade

* feat: use file discovery for secondary control plane nodes

* feat: use file discovery for nodes

* fix: do not fail if rolebinding does not exist

* docs: add warning about kube_api_anonymous_auth

* style: improve readability of delegate_to parameter

* refactor: rename discovery kubeconfig file

* test: enable new variable in hardening and upgrade test cases

* docs: add option to config parameters

* test: multiple instances and upgrade
Rickkwa pushed a commit to Rickkwa/kubespray that referenced this pull request Jun 26, 2024
* feat: add user facing variable with default

* feat: remove rolebinding to anonymous users after init and upgrade

* feat: use file discovery for secondary control plane nodes

* feat: use file discovery for nodes

* fix: do not fail if rolebinding does not exist

* docs: add warning about kube_api_anonymous_auth

* style: improve readability of delegate_to parameter

* refactor: rename discovery kubeconfig file

* test: enable new variable in hardening and upgrade test cases

* docs: add option to config parameters

* test: multiple instances and upgrade
davidumea pushed a commit to elastisys/kubespray that referenced this pull request Oct 25, 2024
* feat: add user facing variable with default

* feat: remove rolebinding to anonymous users after init and upgrade

* feat: use file discovery for secondary control plane nodes

* feat: use file discovery for nodes

* fix: do not fail if rolebinding does not exist

* docs: add warning about kube_api_anonymous_auth

* style: improve readability of delegate_to parameter

* refactor: rename discovery kubeconfig file

* test: enable new variable in hardening and upgrade test cases

* docs: add option to config parameters

* test: multiple instances and upgrade
kpoxo6op pushed a commit to kpoxo6op/kubespray that referenced this pull request Dec 27, 2024
* feat: add user facing variable with default

* feat: remove rolebinding to anonymous users after init and upgrade

* feat: use file discovery for secondary control plane nodes

* feat: use file discovery for nodes

* fix: do not fail if rolebinding does not exist

* docs: add warning about kube_api_anonymous_auth

* style: improve readability of delegate_to parameter

* refactor: rename discovery kubeconfig file

* test: enable new variable in hardening and upgrade test cases

* docs: add option to config parameters

* test: multiple instances and upgrade
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants