
Vulnerable Pods - Discussing the security monitoring and control of pods at runtime #37356

Closed
davidhadas opened this issue Oct 18, 2022 · 12 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. language/en Issues or PRs related to English language lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. sig/security Categorizes an issue or PR as relevant to SIG Security.

Comments

@davidhadas
Contributor

davidhadas commented Oct 18, 2022

This is a Feature Request

What would you like to be added
Add security documentation describing the need to monitor and control deployed vulnerable pods.

  1. Highlight the need to cope with pods that the user assumes to be perfectly secured but that should still be assumed vulnerable to unknown ("zero-day") vulnerabilities
  2. Highlight the need to cope with pods that have a known vulnerability (e.g., a recently published CVE) until the pod's underlying container images (or the pod config) are patched - a process that takes two months on average (see the sketch after this list)
  3. Highlight the need to cope with pods that are known to be exploitable (offenders have an effective exploit running against them) without shutting down the service, until a patch is ready
  4. Highlight the need to cope with pods from a replica set where pods are being misused (offenders have a complete attack pattern to compromise a pod and then misuse it) without shutting down the service, until a patch is ready
  5. Point to Kubernetes native solutions and to other open-source solutions
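For item 2, "patched" in practice usually means rolling out a fixed image so the Deployment replaces the vulnerable Pods. A minimal sketch, with every name and tag hypothetical:

```yaml
# Hypothetical Deployment fragment: remediation means bumping the image
# from the vulnerable tag to a patched one and letting the rollout
# replace the running Pods. Until it completes, vulnerable Pods serve traffic.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: registry.example.com/web:1.2.4-patched  # was 1.2.3 (vulnerable)
```

The window between a CVE being published and that rollout completing is exactly what items 2-4 ask the documentation to address.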

Why is this needed
The entire aspect of monitoring and controlling vulnerable pods is missing from the documentation.

Comments
See Monitoring and Controlling Vulnerable Microservices - in this draft we discuss microservices, but the PR will put things in the context of pods, replica sets and deployments.

@davidhadas davidhadas added the kind/feature Categorizes issue or PR as related to a new feature. label Oct 18, 2022
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Oct 18, 2022
@k8s-ci-robot
Contributor

@davidhadas: This issue is currently awaiting triage.

SIG Docs takes a lead on issue triage for this website, but any Kubernetes member can accept issues by applying the triage/accepted label.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@sftim
Contributor

sftim commented Oct 18, 2022

/sig security
/priority awaiting-more-evidence
/language en

  • Why is this new page something that the Kubernetes documentation should cover?
  • How much of that proposed advice is actually not specific to microservices?
  • Does the proposed change align with our content guide?

So far, Kubernetes' documentation does not really discuss microservices. If we change this, let's do that based on a good reason or, failing that, a nice high number of 👍 issue reactions.

@k8s-ci-robot k8s-ci-robot added sig/security Categorizes an issue or PR as relevant to SIG Security. priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. language/en Issues or PRs related to English language labels Oct 18, 2022
@davidhadas
Contributor Author

davidhadas commented Oct 20, 2022

  • Why is this new page something that the Kubernetes documentation should cover?

Kubernetes, as a system for the management of containerized applications, needs to acknowledge the four cyber use cases that more and more users face:

  • the "Zero day" use case,
  • the use case of a pod in production with known CVE(s),
  • the use case of a pod in production where an effective exploit exists against it, and
  • the use case of pods running in production when offenders have an effective attack pattern to misuse such pods.

This new page would provide the necessary clarity about the practicalities of managing containerized applications in the cyber reality of 2022 and beyond.

  • How much of that proposed advice is actually not specific to microservices?

The proposed advice has nothing to do with microservice specifics.

  • Does the proposed change align with our content guide?

Yes, to the best of my understanding. Note that the draft sent is only a draft; more work is assumed to be needed as part of the PR, and anything not aligned with the content guide will be brought into alignment.

So far, Kubernetes' documentation does not really discuss microservices. If we change this, let's do that based on a good reason or, failing that, a nice high number of 👍 issue reactions.

This is probably my bad - I need to find the right phrasing and put things in a context that aligns with the community's concepts. We can talk about pods and replica sets (and/or deployments) instead of discussing microservices. I will align the issue wording.

@davidhadas davidhadas changed the title Vulnerable microservices - Discussing the security monitoring and control of microservices at runtime Vulnerable Pods - Discussing the security monitoring and control of pods at runtime Oct 20, 2022
@sftim
Contributor

sftim commented Oct 20, 2022

Maybe the word “workload” would be more appropriate than “microservice”? Or the phrase “workload component”.

@davidhadas
Contributor Author

davidhadas commented Oct 20, 2022

Workload, the way I use it, is a set of microservices that work together for some purpose - for example, if you develop a mobile app with a cloud backend, this backend is probably divided into multiple microservices that work together to do what you need to do at the backend. This set of microservices is a cloud workload. You may have N other workloads that you deploy.

So I would not use workload.

Under Kubernetes, security behavior can be associated with a Kubernetes Service (i.e., the service abstraction and the logical set of Pods that serve it for as long as the Service exists - Pods may be added, removed, or replaced).

Security behavior is expected to monitor and control all requests coming via the service (how is an implementation detail) to achieve the goal of identifying zero-day attacks, attempts to exploit a specific known vulnerability, and attempts to use a specific exploit. It is also expected to monitor and control all the Pods presently servicing requests via the service (again, regardless of how) to detect which of them are being misused.
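To sketch that association (all names hypothetical): a Service selects its logical set of Pods by label, so security behavior attached at the Service level naturally follows the Pods as they are added, removed, or replaced:

```yaml
# Hypothetical Service: the selector defines the logical Pod set that
# serves it; monitoring/control scoped to this Service covers whichever
# Pods currently carry the matching label.
apiVersion: v1
kind: Service
metadata:
  name: payments
spec:
  selector:
    app: payments      # the logical set: every Pod labeled app=payments
  ports:
  - port: 80
    targetPort: 8080
```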

@sftim
Contributor

sftim commented Oct 22, 2022

This could work as an evergreen blog article. It'd take some effort to make the content timeless / unlikely to go stale in an unhelpful way. However, I think that effort would be worth it.

/remove-priority awaiting-more-evidence

@k8s-ci-robot k8s-ci-robot removed the priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. label Oct 22, 2022
@sftim
Contributor

sftim commented Oct 22, 2022

Bear in mind that:

  • a Deployment could run Pods that have different container images, even when its ReplicaSet is stable
    (e.g. because someone specified :latest as the tag)
  • not all Pod groupings are exposed as a Service
  • a Service might cover more than one application version (for example, a long-lived canary rollout that runs 5% of the new version and 95% of the stable version - see the sketch just after this list)
    • and one of those versions might have a vulnerability where the other doesn't
  • application Pods can be part of a Job or StatefulSet (ReplicaSet is not the only game in town)
  • stateful Pods can be exposed via Service too

and that:

  • NetworkPolicy is a thing
  • different container runtimes provide better or worse isolation
  • Pod Security Standards are very relevant to frame controls that mitigate lateral movement (together with NetworkPolicy; see the sketch at the end of this comment)
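To make the canary point concrete, here is a hedged sketch (all names, labels, and images hypothetical) of one Service selecting Pods from two Deployments that differ only in a track label - so a vulnerability in one version hides behind a selector that matches both:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: frontend
spec:
  selector:
    app: frontend          # matches BOTH Deployments below
  ports:
  - port: 80
    targetPort: 8080
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend-stable    # ~95% of Pods, assumed unaffected
spec:
  replicas: 19
  selector:
    matchLabels: {app: frontend, track: stable}
  template:
    metadata:
      labels: {app: frontend, track: stable}
    spec:
      containers:
      - name: frontend
        image: registry.example.com/frontend:2.3.0
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend-canary    # ~5% of Pods, hypothetically carrying the CVE
spec:
  replicas: 1
  selector:
    matchLabels: {app: frontend, track: canary}
  template:
    metadata:
      labels: {app: frontend, track: canary}
    spec:
      containers:
      - name: frontend
        image: registry.example.com/frontend:2.4.0-rc1
```

Any per-Service monitoring has to account for requests landing on either version.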

Point to Kubernetes native solutions and to other open-source solutions

Outside of a blog article, we usually wouldn't do that.
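On the NetworkPolicy and Pod Security Standards points, a minimal sketch of the kind of controls that limit what a compromised Pod can do (the namespace name is hypothetical; the pod-security label is the standard Pod Security Admission mechanism):

```yaml
# Enforce the "restricted" Pod Security Standard for the namespace...
apiVersion: v1
kind: Namespace
metadata:
  name: payments
  labels:
    pod-security.kubernetes.io/enforce: restricted
---
# ...and deny all ingress by default, limiting lateral movement from
# a misused Pod. Selected traffic can then be re-allowed per workload.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: payments
spec:
  podSelector: {}      # applies to every Pod in the namespace
  policyTypes:
  - Ingress            # with no ingress rules listed, all ingress is denied
```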

@davidhadas
Contributor Author

Agreed that a blog post is an important venue here, and I will start preparing a draft.
At the same time, I think that the Security concept documentation in Kubernetes should also be updated.

@sftim
Contributor

sftim commented Oct 24, 2022

@kubernetes/sig-security-leads, would we like to staff this work?

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 22, 2023
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 21, 2023
@davidhadas
Contributor Author

See #38104
