
Vulnerable Pods - Discussing the security monitoring and control of pods at runtime #37356

Closed
davidhadas opened this issue Oct 18, 2022 · 12 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. language/en Issues or PRs related to English language lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. sig/security Categorizes an issue or PR as relevant to SIG Security.

Comments

@davidhadas
Contributor

davidhadas commented Oct 18, 2022

This is a Feature Request

What would you like to be added
Add security documentation describing the need to monitor and control deployed vulnerable pods.

  1. Highlight the need to cope with pods that the user assumes to be perfectly secured but that should still be assumed vulnerable to unknown ("zero-day") vulnerabilities
  2. Highlight the need to cope with pods that have a known vulnerability (e.g., a recently published CVE) until the pod's underlying container images (or the pod config) are patched - a process that takes two months on average (see the sketch after this list)
  3. Highlight the need to cope with pods that are known to be exploitable (offenders have an effective exploit running against them) without shutting down the service, until a patch is ready
  4. Highlight the need to cope with pods from a replica set where pods are being misused (offenders have a complete attack pattern to compromise a pod and then misuse it) without shutting down the service, until a patch is ready
  5. Point to Kubernetes native solutions and to other open-source solutions
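For item 2, "patched" in practice usually means rolling out a fixed image so the Deployment replaces the vulnerable Pods. A minimal sketch, with every name and tag hypothetical:

```yaml
# Hypothetical Deployment fragment: remediation means bumping the image
# from the vulnerable tag to a patched one and letting the rollout
# replace the running Pods. Until it completes, vulnerable Pods serve traffic.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: registry.example.com/web:1.2.4-patched  # was 1.2.3 (vulnerable)
```

The window between a CVE being published and that rollout completing is exactly what items 2-4 ask the documentation to address.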

Why is this needed
The entire aspect of monitoring and controlling vulnerable pods is missing from the documentation.

Comments
See Monitoring and Controlling Vulnerable Microservices - in this draft we discuss microservices, but the PR will put things in the context of pods, replica sets and deployments.

@davidhadas davidhadas added the kind/feature Categorizes issue or PR as related to a new feature. label Oct 18, 2022
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Oct 18, 2022
@k8s-ci-robot
Contributor

@davidhadas: This issue is currently awaiting triage.

SIG Docs takes a lead on issue triage for this website, but any Kubernetes member can accept issues by applying the triage/accepted label.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@sftim
Contributor

sftim commented Oct 18, 2022

/sig security
/priority awaiting-more-evidence
/language en

  • Why is this new page something that the Kubernetes documentation should cover?
  • How much of that proposed advice is actually not specific to microservices?
  • Does the proposed change align with our content guide?

So far, Kubernetes' documentation does not really discuss microservices. If we change this, let's do that based on a good reason or, failing that, a nice high number of 👍 issue reactions.

@k8s-ci-robot k8s-ci-robot added sig/security Categorizes an issue or PR as relevant to SIG Security. priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. language/en Issues or PRs related to English language labels Oct 18, 2022
@davidhadas
Contributor Author

davidhadas commented Oct 20, 2022

  • Why is this new page something that the Kubernetes documentation should cover?

Kubernetes, as a system for the management of containerized applications, needs to acknowledge the four cyber use cases that more and more users face:

  • the "Zero day" use case,
  • the use case of a pod in production with known CVE(s),
  • the use case of a pod in production where an effective exploit exists against it, and
  • the use case of pods running in production when offenders have an effective attack pattern to misuse such pods.

This new page would provide the necessary clarity about the practicalities of managing containerized applications in the cyber reality of 2022 and beyond.

  • How much of that proposed advice is actually not specific to microservices?

The proposed advice has nothing to do with microservice specifics.

  • Does the proposed change align with our content guide?

Yes, to the best of my understanding. Note that the draft sent is only a draft; more work is assumed to be needed as part of the PR, and anything not aligned with the content guide will be brought into alignment.

So far, Kubernetes' documentation does not really discuss microservices. If we change this, let's do that based on a good reason or, failing that, a nice high number of 👍 issue reactions.

This is probably my bad - I need to find the right phrasing and put things in a context that aligns with the community's concepts. We can talk about pods and replica sets (and/or deployments) instead of discussing microservices. I will align the issue wording.

@davidhadas davidhadas changed the title Vulnerable microservices - Discussing the security monitoring and control of microservices at runtime Vulnerable Pods - Discussing the security monitoring and control of pods at runtime Oct 20, 2022
@sftim
Contributor

sftim commented Oct 20, 2022

Maybe the word “workload” would be more appropriate than “microservice”? Or the phrase “workload component”.

@davidhadas
Contributor Author

davidhadas commented Oct 20, 2022

Workload, the way I use it, is a set of microservices that work together for some purpose - for example, if you develop a mobile app with a cloud backend, this backend is probably divided into multiple microservices that work together to do what you need to do at the backend. This set of microservices is a cloud workload. You may have N other workloads that you deploy.

So I would not use workload.

Under Kubernetes, security behavior can be associated with a Kubernetes Service (i.e., the service abstraction and the logical set of Pods that serve it for as long as the Service exists - Pods may be added, removed, or replaced).

Security behavior is expected to monitor and control all requests coming via the service (how is an implementation detail) to achieve the goal of identifying zero-day attacks, attempts to exploit a specific known vulnerability, and attempts to use a specific exploit. It is also expected to monitor and control all the Pods presently servicing requests via the service (again, regardless of how) to detect which of them are being misused.
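To sketch that association (all names hypothetical): a Service selects its logical set of Pods by label, so security behavior attached at the Service level naturally follows the Pods as they are added, removed, or replaced:

```yaml
# Hypothetical Service: the selector defines the logical Pod set that
# serves it; monitoring/control scoped to this Service covers whichever
# Pods currently carry the matching label.
apiVersion: v1
kind: Service
metadata:
  name: payments
spec:
  selector:
    app: payments      # the logical set: every Pod labeled app=payments
  ports:
  - port: 80
    targetPort: 8080
```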

@sftim
Contributor

sftim commented Oct 22, 2022

This could work as an evergreen blog article. It'd take some effort to make the content timeless / unlikely to go stale in an unhelpful way. However, I think that effort would be worth it.

/remove-priority awaiting-more-evidence

@k8s-ci-robot k8s-ci-robot removed the priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. label Oct 22, 2022
@sftim
Contributor

sftim commented Oct 22, 2022

Bear in mind that:

  • a Deployment could run Pods that have different container images, even when its ReplicaSet is stable
    (e.g. because someone specified :latest as the tag)
  • not all Pod groupings are exposed as a Service
  • a Service might cover more than one application version (for example, a long-lived canary rollout that runs 5% of the new version and 95% of the stable version - see the sketch just after this list)
    • and one of those versions might have a vulnerability where the other doesn't
  • application Pods can be part of a Job or StatefulSet (ReplicaSet is not the only game in town)
  • stateful Pods can be exposed via Service too

and that:

  • NetworkPolicy is a thing
  • different container runtimes provide better or worse isolation
  • Pod Security Standards are very relevant to frame controls that mitigate lateral movement (together with NetworkPolicy; see the sketch at the end of this comment)
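To make the canary point concrete, here is a hedged sketch (all names, labels, and images hypothetical) of one Service selecting Pods from two Deployments that differ only in a track label - so a vulnerability in one version hides behind a selector that matches both:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: frontend
spec:
  selector:
    app: frontend          # matches BOTH Deployments below
  ports:
  - port: 80
    targetPort: 8080
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend-stable    # ~95% of Pods, assumed unaffected
spec:
  replicas: 19
  selector:
    matchLabels: {app: frontend, track: stable}
  template:
    metadata:
      labels: {app: frontend, track: stable}
    spec:
      containers:
      - name: frontend
        image: registry.example.com/frontend:2.3.0
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend-canary    # ~5% of Pods, hypothetically carrying the CVE
spec:
  replicas: 1
  selector:
    matchLabels: {app: frontend, track: canary}
  template:
    metadata:
      labels: {app: frontend, track: canary}
    spec:
      containers:
      - name: frontend
        image: registry.example.com/frontend:2.4.0-rc1
```

Any per-Service monitoring has to account for requests landing on either version.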

Point to Kubernetes native solutions and to other open-source solutions

Outside of a blog article, we usually wouldn't do that.
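On the NetworkPolicy and Pod Security Standards points, a minimal sketch of the kind of controls that limit what a compromised Pod can do (the namespace name is hypothetical; the pod-security label is the standard Pod Security Admission mechanism):

```yaml
# Enforce the "restricted" Pod Security Standard for the namespace...
apiVersion: v1
kind: Namespace
metadata:
  name: payments
  labels:
    pod-security.kubernetes.io/enforce: restricted
---
# ...and deny all ingress by default, limiting lateral movement from
# a misused Pod. Selected traffic can then be re-allowed per workload.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: payments
spec:
  podSelector: {}      # applies to every Pod in the namespace
  policyTypes:
  - Ingress            # with no ingress rules listed, all ingress is denied
```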

@davidhadas
Contributor Author

Agreed that a blog post is an important venue here, and I will start preparing a draft.
At the same time, I think that the Security concept documentation in Kubernetes should also be updated.

@sftim
Contributor

sftim commented Oct 24, 2022

@kubernetes/sig-security-leads, would we like to staff this work?

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 22, 2023
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 21, 2023
@davidhadas
Contributor Author

See #38104
