Add a webhook to reject the gatekeeper-ignore label on non-GK namespaces #350

Merged
merged 19 commits into from
Jan 24, 2020
91 changes: 77 additions & 14 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -358,6 +358,34 @@ status:
```
> NOTE: The supported enforcementActions are [`deny`, `dryrun`] for constraints. Set the `--disable-enforcementaction-validation=true` flag to disable validation of enforcementAction against the list of supported enforcementActions.

### Exempting Namespaces from the Gatekeeper Admission Webhook

Note that the following only exempts resources from the admission webhook. They will still be audited. To exclude them from audit as well,
you must edit the individual constraints.

If it becomes necessary to exempt a namespace from the Gatekeeper webhook entirely (e.g. you want `kube-system` to bypass admission checks), here's how to do it:

1. Make sure the validating admission webhook configuration for Gatekeeper has the following namespace selector:

```yaml
namespaceSelector:
  matchExpressions:
    - key: admission.gatekeeper.sh/ignore
      operator: DoesNotExist
```
The default Gatekeeper manifest should already include this selector. The default name for the
webhook configuration is `gatekeeper-validating-webhook-configuration`, and the default
name for the webhook that needs the namespace selector is `validation.gatekeeper.sh`.

2. Tell Gatekeeper it's okay for the namespace to be ignored by adding a flag to the Gatekeeper pod:
`--exempt-namespace=<NAMESPACE NAME>`. This step is necessary because otherwise the
permission to modify a namespace would be equivalent to the permission to exempt everything
in that namespace from policy checks. With the flag, a user must explicitly have permission
to configure the Gatekeeper pod before they can add exemptions.

3. Add the `admission.gatekeeper.sh/ignore` label to the namespace. The value attached
to the label is ignored, so it can be used to annotate the reason for the exemption.
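
As a rough sketch of steps 2 and 3, the fragments below show where the flag and the label would go. The namespace name `example-ns` and the label value `exempted-by-cluster-admin` are illustrative assumptions. The container name `manager` follows the default manifest and may differ in your installation.

```yaml
# Step 2 (fragment of the Gatekeeper Deployment's pod spec, not a complete
# resource): allow the namespace to carry the ignore label.
# `example-ns` is a hypothetical namespace name.
containers:
  - name: manager
    args:
      - "--exempt-namespace=example-ns"
---
# Step 3: label the namespace. Gatekeeper ignores the label's value, so it is
# used here to record a (hypothetical) reason for the exemption.
apiVersion: v1
kind: Namespace
metadata:
  name: example-ns
  labels:
    admission.gatekeeper.sh/ignore: exempted-by-cluster-admin
```

The order matters: with the namespace-label webhook in place, adding the label before the flag is set would be expected to be rejected.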

### Debugging

> NOTE: Verbose logging with DEBUG level can be turned on with `--log-level=DEBUG`. By default, the `--log-level` flag is set to minimum log level `INFO`. Acceptable values for minimum log level are [`DEBUG`, `INFO`, `WARNING`, `ERROR`]. In production, this flag should not be set to `DEBUG`.
Expand Down Expand Up @@ -411,6 +439,55 @@ When applying the constraint using `kubectl apply -f constraint.yaml` with a Con

To find the error, run `kubectl get -f [CONSTRAINT_FILENAME].yaml -oyaml`. Build errors are shown in the `status` field.

### Customizing Admission Behavior

Gatekeeper is a [Kubernetes admission webhook](https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/#webhook-configuration)
whose default configuration can be found in the `gatekeeper.yaml` manifest file. By default, it is
a `ValidatingWebhookConfiguration` resource named `gatekeeper-validating-webhook-configuration`.

Currently the configuration specifies two webhooks: one for checking a request against
the installed constraints and a second webhook for checking labels on namespace requests
that would result in bypassing constraints for the namespace. The namespace-label webhook
is necessary to prevent a privilege escalation where the permission to add a label to a
namespace is equivalent to the ability to bypass all constraints for that namespace.
You can read more about the ability to exempt namespaces by label [above](#exempting-namespaces-from-the-gatekeeper-admission-webhook).

Because Kubernetes adds features with each version, the official documentation linked at the top of this section
is the best reference for how the webhook can be configured. However, two particularly important
configuration options deserve special mention: [timeouts](https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/#timeouts) and
[failure policy](https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/#failure-policy).

Timeouts allow you to configure how long the API server will wait for a response from the admission webhook before it
considers the request to have failed. Note that setting the timeout longer than the overall request timeout
means that the main request will time out before the webhook's failure policy is invoked.

Failure policy controls what happens when a webhook fails for whatever reason. Common
failure scenarios include timeouts, a 5xx error from the server, or the webhook being unavailable.
You can either ignore errors (`Ignore`), which allows the request through, or fail (`Fail`), which rejects the request.
This results in a direct tradeoff between availability and enforcement.

Gatekeeper currently defaults to `Ignore` for the constraint requests. This is because
the webhook server currently has only one instance, which risks downtime during actions like upgrades.
As the theoretical availability improves we will likely change the default to `Fail`.

The namespace label webhook defaults to `Fail`. This helps ensure that policies preventing
labels that bypass the webhook from being applied are enforced. Because this webhook only gets
called for namespace modification requests, the impact of downtime is mitigated, making the
theoretical maximum availability less of an issue.
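
To make the two options concrete, here is a sketch of how `timeoutSeconds` and `failurePolicy` might appear on the two webhooks. It is a fragment of the `webhooks` list rather than a complete `ValidatingWebhookConfiguration`. The values mirror the defaults described in this section and should be treated as a starting point, not a recommendation.

```yaml
webhooks:
  # Constraint webhook: errors and timeouts let the request through.
  - name: validation.gatekeeper.sh
    failurePolicy: Ignore
    timeoutSeconds: 5
  # Namespace-label webhook: errors and timeouts reject the request.
  - name: check-ignore-label.gatekeeper.sh
    failurePolicy: Fail
    timeoutSeconds: 5
```

Switching the first webhook to `Fail` trades availability for stricter enforcement, so it is probably only worth considering once the webhook server runs with enough redundancy for your cluster.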

Because the manifest is available for customization, the webhook configuration can
be tuned to meet your specific needs if they differ from the defaults.

### Emergency Recovery

If a situation arises where Gatekeeper is preventing the cluster from operating correctly,
the webhook can be disabled. This will remove all Gatekeeper admission checks. Assuming
the default webhook name has been used, this can be achieved by running:

`kubectl delete validatingwebhookconfigurations.admissionregistration.k8s.io gatekeeper-validating-webhook-configuration`

Redeploying the webhook configuration will re-enable Gatekeeper.

## Kick The Tires

The [demo/basic](https://github.com/open-policy-agent/gatekeeper/tree/master/demo/basic) directory contains the above examples of simple constraints, templates and configs to play with. The [demo/agilebank](https://github.com/open-policy-agent/gatekeeper/tree/master/demo/agilebank) directory contains more complex examples based on a slightly more realistic scenario. Both folders have a handy demo script to step you through the demos.
Expand All @@ -419,17 +496,6 @@ The [demo/basic](https://github.com/open-policy-agent/gatekeeper/tree/master/dem

## Finalizers

### Why does Gatekeeper add sync finalizers?

When Gatekeeper syncs resources it's adding them to OPA's internal cache. This
cache may be used by constraints to render decisions. Because of this stale data
is bad. It can lead to invalid rejections (e.g. when a uniqueness constraint is
violated because an update conflicts with a since-deleted resource), or invalid
acceptance (e.g. if a constraint uses the cache to make sure a Deployment exists
before a Service can be created). Finalizers help avoid stale state by making
sure Gatekeeper has processed the deletion and removed the object from its cache
before the API Server can garbage collect the object.

### How can I remove finalizers? Why are they hanging around?

If Gatekeeper is running, it should automatically clean up the finalizer. If it
Expand All @@ -448,7 +514,4 @@ If Gatekeeper is not running:
* The container was sent a hard kill signal
* The container had a panic

It is safest to remove the Config resource before uninstalling Gatekeeper, as
that causes finalizers to be removed outside of the normal GC process.

Finalizers can be removed manually via `kubectl edit` or `kubectl patch`
4 changes: 2 additions & 2 deletions config/default/kustomization.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -9,8 +9,8 @@ namespace: gatekeeper-system
namePrefix: gatekeeper-

# Labels to add to all resources and selectors.
#commonLabels:
# someName: someValue
commonLabels:
  gatekeeper.sh/system: "yes"

bases:
- ../crd
Expand Down
2 changes: 2 additions & 0 deletions config/manager/manager.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@ kind: Namespace
metadata:
labels:
control-plane: controller-manager
admission.gatekeeper.sh/ignore: no-self-managing
name: system
---
apiVersion: apps/v1
Expand Down Expand Up @@ -30,6 +31,7 @@ spec:
- "--audit-interval=30"
- "--port=8443"
- "--logtostderr"
- "--exempt-namespace=gatekeeper-system"
Member

In addition to gatekeeper-system, I think we should also add kube-system to the default exemption list. I know there are users who have the control-plane label on kube-system today to ensure resources in that namespace are not impacted by Gatekeeper.

Contributor Author

I'm more concerned about unexpectedly opening a security hole for users who haven't already done so.

Users that have set the control-plane label on kube-system will break anyway once the label becomes deprecated. Because we should not modify people's kube-system namespace by default, including kube-system as an exempt namespace would be the worst of both worlds: previous exclusion efforts will still fail, and users who didn't want to exclude kube-system will unknowingly be vulnerable to people adding the exclusion label.

This was discussed in the meeting 2.5 weeks ago.

image: quay.io/open-policy-agent/gatekeeper:v3.1.0-beta.4
imagePullPolicy: Always
name: manager
Expand Down
18 changes: 18 additions & 0 deletions config/webhook/manifests.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,24 @@ metadata:
creationTimestamp: null
name: validating-webhook-configuration
webhooks:
- clientConfig:
    caBundle: Cg==
    service:
      name: webhook-service
      namespace: system
      path: /v1/admitlabel
  failurePolicy: Fail
  name: check-ignore-label.gatekeeper.sh
  rules:
  - apiGroups:
    - ""
    apiVersions:
    - '*'
    operations:
    - CREATE
    - UPDATE
    resources:
    - namespaces
- clientConfig:
caBundle: Cg==
service:
Expand Down
7 changes: 7 additions & 0 deletions config/webhook/webhook_patch.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -16,5 +16,12 @@ webhooks:
timeoutSeconds: 5
namespaceSelector:
matchExpressions:
# using the control-plane label to bypass Gatekeeper is deprecated
# and will be removed from the default config in a future version
- key: control-plane
operator: DoesNotExist
- key: admission.gatekeeper.sh/ignore
operator: DoesNotExist
- name: check-ignore-label.gatekeeper.sh
sideEffects: None
timeoutSeconds: 5
Original file line number Diff line number Diff line change
Expand Up @@ -54,6 +54,7 @@ spec:
- --logtostderr
- --constraint-violations-limit={{ .Values.constraintViolationsLimit }}
- --audit-from-cache={{ .Values.auditFromCache }}
- --exempt-namespace=gatekeeper-system
imagePullPolicy: "{{ .Values.image.pullPolicy }}"
image: "{{ .Values.image.repository }}:{{ .Values.image.release }}"
resources: HELMSUBST_DEPLOYMENT_CONTAINER_RESOURCES
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -2,9 +2,11 @@ apiVersion: v1
kind: Namespace
metadata:
labels:
admission.gatekeeper.sh/ignore: no-self-managing
app: '{{ template "gatekeeper-operator.name" . }}'
chart: '{{ template "gatekeeper-operator.name" . }}'
control-plane: controller-manager
gatekeeper.sh/system: "yes"
heritage: '{{ .Release.Service }}'
release: '{{ .Release.Name }}'
name: gatekeeper-system
Expand All @@ -20,6 +22,7 @@ metadata:
labels:
app: '{{ template "gatekeeper-operator.name" . }}'
chart: '{{ template "gatekeeper-operator.name" . }}'
gatekeeper.sh/system: "yes"
heritage: '{{ .Release.Service }}'
release: '{{ .Release.Name }}'
name: configs.config.gatekeeper.sh
Expand Down Expand Up @@ -220,6 +223,7 @@ metadata:
labels:
app: '{{ template "gatekeeper-operator.name" . }}'
chart: '{{ template "gatekeeper-operator.name" . }}'
gatekeeper.sh/system: "yes"
heritage: '{{ .Release.Service }}'
release: '{{ .Release.Name }}'
name: gatekeeper-admin
Expand All @@ -232,6 +236,7 @@ metadata:
labels:
app: '{{ template "gatekeeper-operator.name" . }}'
chart: '{{ template "gatekeeper-operator.name" . }}'
gatekeeper.sh/system: "yes"
heritage: '{{ .Release.Service }}'
release: '{{ .Release.Name }}'
name: gatekeeper-manager-role
Expand All @@ -257,6 +262,7 @@ metadata:
labels:
app: '{{ template "gatekeeper-operator.name" . }}'
chart: '{{ template "gatekeeper-operator.name" . }}'
gatekeeper.sh/system: "yes"
heritage: '{{ .Release.Service }}'
release: '{{ .Release.Name }}'
name: gatekeeper-manager-role
Expand Down Expand Up @@ -352,6 +358,7 @@ metadata:
labels:
app: '{{ template "gatekeeper-operator.name" . }}'
chart: '{{ template "gatekeeper-operator.name" . }}'
gatekeeper.sh/system: "yes"
heritage: '{{ .Release.Service }}'
release: '{{ .Release.Name }}'
name: gatekeeper-manager-rolebinding
Expand All @@ -371,6 +378,7 @@ metadata:
labels:
app: '{{ template "gatekeeper-operator.name" . }}'
chart: '{{ template "gatekeeper-operator.name" . }}'
gatekeeper.sh/system: "yes"
heritage: '{{ .Release.Service }}'
release: '{{ .Release.Name }}'
name: gatekeeper-manager-rolebinding
Expand All @@ -389,6 +397,7 @@ metadata:
labels:
app: '{{ template "gatekeeper-operator.name" . }}'
chart: '{{ template "gatekeeper-operator.name" . }}'
gatekeeper.sh/system: "yes"
heritage: '{{ .Release.Service }}'
release: '{{ .Release.Name }}'
name: gatekeeper-webhook-server-cert
Expand All @@ -400,6 +409,7 @@ metadata:
labels:
app: '{{ template "gatekeeper-operator.name" . }}'
chart: '{{ template "gatekeeper-operator.name" . }}'
gatekeeper.sh/system: "yes"
heritage: '{{ .Release.Service }}'
release: '{{ .Release.Name }}'
name: gatekeeper-webhook-service
Expand All @@ -412,6 +422,7 @@ spec:
app: '{{ template "gatekeeper-operator.name" . }}'
chart: '{{ template "gatekeeper-operator.name" . }}'
control-plane: controller-manager
gatekeeper.sh/system: "yes"
heritage: '{{ .Release.Service }}'
release: '{{ .Release.Name }}'
---
Expand All @@ -422,6 +433,7 @@ metadata:
app: '{{ template "gatekeeper-operator.name" . }}'
chart: '{{ template "gatekeeper-operator.name" . }}'
control-plane: controller-manager
gatekeeper.sh/system: "yes"
heritage: '{{ .Release.Service }}'
release: '{{ .Release.Name }}'
name: gatekeeper-controller-manager
Expand All @@ -433,6 +445,7 @@ spec:
app: '{{ template "gatekeeper-operator.name" . }}'
chart: '{{ template "gatekeeper-operator.name" . }}'
control-plane: controller-manager
gatekeeper.sh/system: "yes"
heritage: '{{ .Release.Service }}'
release: '{{ .Release.Name }}'
template:
Expand All @@ -441,6 +454,7 @@ spec:
app: '{{ template "gatekeeper-operator.name" . }}'
chart: '{{ template "gatekeeper-operator.name" . }}'
control-plane: controller-manager
gatekeeper.sh/system: "yes"
heritage: '{{ .Release.Service }}'
release: '{{ .Release.Name }}'
spec:
Expand All @@ -451,6 +465,7 @@ spec:
- --logtostderr
- --constraint-violations-limit={{ .Values.constraintViolationsLimit }}
- --audit-from-cache={{ .Values.auditFromCache }}
- --exempt-namespace=gatekeeper-system
command:
- /manager
env:
Expand Down Expand Up @@ -516,6 +531,7 @@ metadata:
labels:
app: '{{ template "gatekeeper-operator.name" . }}'
chart: '{{ template "gatekeeper-operator.name" . }}'
gatekeeper.sh/system: "yes"
heritage: '{{ .Release.Service }}'
release: '{{ .Release.Name }}'
name: gatekeeper-validating-webhook-configuration
Expand All @@ -532,6 +548,8 @@ webhooks:
matchExpressions:
- key: control-plane
operator: DoesNotExist
- key: admission.gatekeeper.sh/ignore
operator: DoesNotExist
rules:
- apiGroups:
- '*'
Expand All @@ -544,3 +562,23 @@ webhooks:
- '*'
sideEffects: None
timeoutSeconds: 5
- clientConfig:
    caBundle: Cg==
    service:
      name: gatekeeper-webhook-service
      namespace: gatekeeper-system
      path: /v1/admitlabel
  failurePolicy: Fail
  name: check-ignore-label.gatekeeper.sh
  rules:
  - apiGroups:
    - ""
    apiVersions:
    - '*'
    operations:
    - CREATE
    - UPDATE
    resources:
    - namespaces
  sideEffects: None
  timeoutSeconds: 5