A proposal for refactoring SecurityContext to have pod-level and container-level attributes in
order to correctly model pod- and container-level security concerns.
Currently, containers have a SecurityContext attribute which contains information about the
security settings the container uses. In practice, many of these attributes are uniform across all
containers in a pod. At the same time, some security attributes apply only to the pod as a whole,
so the security context pattern is also needed at the pod level to model them correctly.
Users should be able to:
- Express security settings that are applicable to the entire pod
- Express base security settings that apply to all containers
- Override only the settings that need to be differentiated from the base in individual containers
This proposal is a dependency for other changes related to security context:
Goals of this design:
- Describe the use cases for which a pod-level security context is necessary
- Thoroughly describe the API backward compatibility issues that arise from the introduction of a pod-level security context
- Describe all implementation changes necessary for the feature
- We will not design for intra-pod security; we are not currently concerned about isolating containers in the same pod from one another
- We will design for backward compatibility with the current V1 API
- As a developer, I want to correctly model security attributes which belong to an entire pod
- As a user, I want to be able to specify container attributes that apply to all containers without repeating myself
- As an existing user, I want to be able to use the existing container-level security API
Some security attributes make sense only to model at the pod level. For example, it is a
fundamental property of pods that all containers in a pod share the same network namespace.
Therefore, using the host namespace makes sense to model at the pod level only, and indeed, today
it is part of the PodSpec. Other host namespace support is currently being added and these will
also be pod-level settings; it makes sense to model them as a pod-level collection of security
attributes.
Some use cases require the containers in a pod to run with different security settings. As an example, a user may want to have a pod with two containers, one of which runs as root with the privileged setting, and one that runs as a non-root UID. To support use cases like this, it should be possible to override appropriate (i.e., not intrinsically pod-level) security settings for individual containers.
For posterity and ease of reading, note the current state of SecurityContext:
package api

type Container struct {
	// Other fields omitted

	// Optional: SecurityContext defines the security options the pod should be run with
	SecurityContext *SecurityContext `json:"securityContext,omitempty"`
}

type SecurityContext struct {
	// Capabilities are the capabilities to add/drop when running the container
	Capabilities *Capabilities `json:"capabilities,omitempty"`

	// Run the container in privileged mode
	Privileged *bool `json:"privileged,omitempty"`

	// SELinuxOptions are the labels to be applied to the container
	// and volumes
	SELinuxOptions *SELinuxOptions `json:"seLinuxOptions,omitempty"`

	// RunAsUser is the UID to run the entrypoint of the container process.
	RunAsUser *int64 `json:"runAsUser,omitempty"`

	// RunAsNonRoot indicates that the container should be run as a non-root user. If the RunAsUser
	// field is not explicitly set then the kubelet may check the image for a specified user or
	// perform defaulting to specify a user.
	RunAsNonRoot bool `json:"runAsNonRoot,omitempty"`
}

// SELinuxOptions contains the fields that make up the SELinux context of a container.
type SELinuxOptions struct {
	// SELinux user label
	User string `json:"user,omitempty"`

	// SELinux role label
	Role string `json:"role,omitempty"`

	// SELinux type label
	Type string `json:"type,omitempty"`

	// SELinux level label.
	Level string `json:"level,omitempty"`
}
PodSecurityContext specifies two types of security attributes:
- Attributes that apply to the pod itself
- Attributes that apply to the containers of the pod
In the internal API, fields of the PodSpec controlling the use of the host PID, IPC, and network
namespaces are relocated to this type:
package api

type PodSpec struct {
	// Other fields omitted

	// Optional: SecurityContext specifies pod-level attributes and container security attributes
	// that apply to all containers.
	SecurityContext *PodSecurityContext `json:"securityContext,omitempty"`
}

// PodSecurityContext specifies security attributes of the pod and container attributes that apply
// to all containers of the pod.
type PodSecurityContext struct {
	// Use the host's network namespace. If this option is set, the ports that will be
	// used must be specified.
	// Optional: Default to false.
	HostNetwork bool

	// Use the host's IPC namespace
	HostIPC bool

	// Use the host's PID namespace
	HostPID bool

	// Capabilities are the capabilities to add/drop when running containers
	Capabilities *Capabilities `json:"capabilities,omitempty"`

	// Run the container in privileged mode
	Privileged *bool `json:"privileged,omitempty"`

	// SELinuxOptions are the labels to be applied to the container
	// and volumes
	SELinuxOptions *SELinuxOptions `json:"seLinuxOptions,omitempty"`

	// RunAsUser is the UID to run the entrypoint of the container process.
	RunAsUser *int64 `json:"runAsUser,omitempty"`

	// RunAsNonRoot indicates that the container should be run as a non-root user. If the RunAsUser
	// field is not explicitly set then the kubelet may check the image for a specified user or
	// perform defaulting to specify a user.
	RunAsNonRoot bool
}

// Comments and generated docs will change for the container.SecurityContext field to indicate
// the precedence of these fields over the pod-level ones.
type Container struct {
	// Other fields omitted

	// Optional: SecurityContext defines the security options the pod should be run with.
	// Settings specified in this field take precedence over the settings defined in
	// pod.Spec.SecurityContext.
	SecurityContext *SecurityContext `json:"securityContext,omitempty"`
}
In the V1 API, the pod-level security attributes which are currently fields of the PodSpec are
retained on the PodSpec for backward compatibility purposes:
package v1

type PodSpec struct {
	// Other fields omitted

	// Use the host's network namespace. If this option is set, the ports that will be
	// used must be specified.
	// Optional: Default to false.
	HostNetwork bool `json:"hostNetwork,omitempty"`

	// Use the host's pid namespace.
	// Optional: Default to false.
	HostPID bool `json:"hostPID,omitempty"`

	// Use the host's ipc namespace.
	// Optional: Default to false.
	HostIPC bool `json:"hostIPC,omitempty"`

	// Optional: SecurityContext specifies pod-level attributes and container security attributes
	// that apply to all containers.
	SecurityContext *PodSecurityContext `json:"securityContext,omitempty"`
}
The pod.Spec.SecurityContext specifies the security context of all containers in the pod.
The containers' securityContext field is overlaid on the base security context to determine the
effective security context for the container.
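The overlay described above can be sketched in Go. This is a minimal illustration, not the Kubelet's implementation: the function name effectiveSecurityContext is hypothetical, and the types are trimmed stand-ins that model only two pointer-valued fields (RunAsUser and Privileged), where a nil pointer means "not set".

```go
package main

import "fmt"

// Trimmed stand-ins for the proposal's types; only two fields are modeled.
type PodSecurityContext struct {
	RunAsUser  *int64
	Privileged *bool
}

type SecurityContext struct {
	RunAsUser  *int64
	Privileged *bool
}

// effectiveSecurityContext (hypothetical helper) starts from the pod-level
// values and lets every field set on the container override them.
func effectiveSecurityContext(pod *PodSecurityContext, c *SecurityContext) SecurityContext {
	var eff SecurityContext
	if pod != nil {
		eff.RunAsUser = pod.RunAsUser
		eff.Privileged = pod.Privileged
	}
	if c != nil {
		if c.RunAsUser != nil {
			eff.RunAsUser = c.RunAsUser
		}
		if c.Privileged != nil {
			eff.Privileged = c.Privileged
		}
	}
	return eff
}

func main() {
	podUID, overrideUID := int64(1001), int64(1002)
	pod := &PodSecurityContext{RunAsUser: &podUID}

	// Container a overrides runAsUser; container b sets nothing.
	a := effectiveSecurityContext(pod, &SecurityContext{RunAsUser: &overrideUID})
	b := effectiveSecurityContext(pod, nil)

	fmt.Println("a:", *a.RunAsUser) // a: 1002
	fmt.Println("b:", *b.RunAsUser) // b: 1001
}
```

Pointer fields make "unset" distinguishable from a zero value, which is what lets a container inherit the pod-level setting. The sketch leaves out the non-pointer RunAsNonRoot field for that reason.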
The new V1 API should be backward compatible with the existing API. Backward compatibility is defined as:
- Any API call (e.g. a structure POSTed to a REST endpoint) that worked before your change must work the same after your change.
- Any API call that uses your change must not cause problems (e.g. crash or degrade behavior) when issued against servers that do not include your change.
- It must be possible to round-trip your change (convert to different API versions and back) with no loss of information.
Previous versions of this proposal attempted to deal with backward compatibility by defining the effect of setting the pod-level fields on the container-level fields. While trying to find consensus on this design, it became apparent that this approach was going to be extremely complex to implement, explain, and support. Instead, we will approach backward compatibility as follows:
- Pod-level and container-level settings will not affect one another
- Old clients will be able to use container-level settings in the exact same way
- Container-level settings always override pod-level settings if they are set
- Old client using pod.Spec.Containers[x].SecurityContext

  An old client creates a pod:

      apiVersion: v1
      kind: Pod
      metadata:
        name: test-pod
      spec:
        containers:
        - name: a
          securityContext:
            runAsUser: 1001
        - name: b
          securityContext:
            runAsUser: 1002

  looks to old clients like:

      apiVersion: v1
      kind: Pod
      metadata:
        name: test-pod
      spec:
        containers:
        - name: a
          securityContext:
            runAsUser: 1001
        - name: b
          securityContext:
            runAsUser: 1002

  looks to new clients like:

      apiVersion: v1
      kind: Pod
      metadata:
        name: test-pod
      spec:
        containers:
        - name: a
          securityContext:
            runAsUser: 1001
        - name: b
          securityContext:
            runAsUser: 1002

- New client using pod.Spec.SecurityContext

  A new client creates a pod using a field of pod.Spec.SecurityContext:

      apiVersion: v1
      kind: Pod
      metadata:
        name: test-pod
      spec:
        securityContext:
          runAsUser: 1001
        containers:
        - name: a
        - name: b

  appears to new clients as:

      apiVersion: v1
      kind: Pod
      metadata:
        name: test-pod
      spec:
        securityContext:
          runAsUser: 1001
        containers:
        - name: a
        - name: b

  old clients will see:

      apiVersion: v1
      kind: Pod
      metadata:
        name: test-pod
      spec:
        containers:
        - name: a
        - name: b

- Pods created using pod.Spec.SecurityContext and pod.Spec.Containers[x].SecurityContext

  If a field is set in both pod.Spec.SecurityContext and pod.Spec.Containers[x].SecurityContext,
  the value in pod.Spec.Containers[x].SecurityContext wins. In the following pod:

      apiVersion: v1
      kind: Pod
      metadata:
        name: test-pod
      spec:
        securityContext:
          runAsUser: 1001
        containers:
        - name: a
          securityContext:
            runAsUser: 1002
        - name: b

  The effective setting for runAsUser for container a is 1002.

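One plausible reading of why old clients do not see the pod-level field is that their types have no such field, so the decoder drops it. The sketch below illustrates this with Go's encoding/json, which ignores unknown fields by default. The type names oldPodSpec and newPodSpec and the helper reencode are hypothetical, and the stored object is reduced to the pod spec for brevity.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type container struct {
	Name string `json:"name"`
}

// oldPodSpec models a client built before the pod-level field existed.
type oldPodSpec struct {
	Containers []container `json:"containers"`
}

type podSecurityContext struct {
	RunAsUser *int64 `json:"runAsUser,omitempty"`
}

// newPodSpec models a client that knows about spec.securityContext.
type newPodSpec struct {
	SecurityContext *podSecurityContext `json:"securityContext,omitempty"`
	Containers      []container         `json:"containers"`
}

// stored is the spec as written by a new client.
const stored = `{"securityContext":{"runAsUser":1001},"containers":[{"name":"a"},{"name":"b"}]}`

// reencode decodes the stored spec into v (a pointer to a client's type)
// and encodes it again, showing what that client can observe.
func reencode(v interface{}) (string, error) {
	if err := json.Unmarshal([]byte(stored), v); err != nil {
		return "", err
	}
	out, err := json.Marshal(v)
	return string(out), err
}

func main() {
	oldView, err := reencode(&oldPodSpec{})
	if err != nil {
		panic(err)
	}
	newView, err := reencode(&newPodSpec{})
	if err != nil {
		panic(err)
	}
	fmt.Println("old client:", oldView)
	fmt.Println("new client:", newView)
}
```

In this sketch the old client's view contains only the containers list, while the new client's view also carries securityContext.runAsUser.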
A backward compatibility test suite will be established for the v1 API. The test suite will verify compatibility by converting objects into the internal API and back to the versioned API and examining the results.
All of the examples here will be used as test cases. As more test cases are added, the proposal will be updated.
An example of a test like this can be found in the OpenShift API package.
E2E test cases will be added to test the correct determination of the security context for containers.
- The Kubelet will use the new fields on the PodSecurityContext for host namespace control
- The Kubelet will be modified to correctly implement the backward compatibility and effective security context determination defined here