Skip to content

Commit

Permalink
Address liggitts comments
Browse files Browse the repository at this point in the history
  • Loading branch information
sttts committed Apr 30, 2019
1 parent 29c14e7 commit 236e1e0
Showing 1 changed file with 40 additions and 27 deletions.
67 changes: 40 additions & 27 deletions keps/sig-api-machinery/20180731-crd-pruning.md
Original file line number Diff line number Diff line change
Expand Up @@ -53,29 +53,29 @@ CustomResources store arbitrary JSON data without following the typical Kubernet

This KEP proposes to add pruning of all fields which are not specified in the OpenAPI validation schemas given in the CRD.

Pruning will be opt-in in `v1beta1` of `apiextensions.k8s.io` via
Pruning will be opt-in in v1beta1 of `apiextensions.k8s.io` via

```yaml
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
spec:
pruneUnknownFields: true
preserveUnknownFields: false
...
```

i.e. CRDs created in `v1beta1` default to disabled pruning.
i.e. CRDs created in v1beta1 default to disabled pruning.

**Pruning will be enabled by default for CRDs created in `v1`**.
**Pruning will be enabled for every CRDs created in v1** and we will hide `preserveUnknownFields` in v1 in that case.

Pruning requires _structural schemas_ (as described in [KEP Vanilla OpenAPI Subset: Structural Schema](https://github.com/kubernetes/enhancements/pull/1002)) for all defined versions. Validation will reject the CRD with `pruneUnknownFields: true` otherwise.
Pruning can be disabled for subtrees of CustomResources by setting the `x-kubernetes-preserve-unknown-fields: true` vendor extension. This allows to store arbitrary JSON or RawExtensions. This will even be possible at the root level, even in v1, leading to the old behaviour, but explicitly opted-in for every schema and every version, and published in the OpenAPI v2 (and later v3) spec of the API server.

Pruning can be locally disabled for subtrees of CustomResources by setting the `x-kubernetes-preserve-unknown-fields: true` vendor extension. This allows to store arbitrary JSON or RawExtensions.
Pruning requires _structural schemas_ (as described in [KEP Vanilla OpenAPI Subset: Structural Schema](https://github.com/kubernetes/enhancements/pull/1002)) for all defined versions. Validation will reject the CRD with `preserveUnknownFields: false` otherwise.

## Motivation

* Native Golang based resources do pruning as a consequence of the JSON unmarshalling algorithm. This is has become a fundamental behaviour of Kubernetes API semantics that CustomResources break.
* Pruning enforces consistency of data stored in etcd. Objects cannot suddenly render unaccessible because unexpected data breaks decoding.
* Even if unexpected data is of the right type and does not break decoding, it has not gone through admission or validation. Pruning enforces this (scenario: applying a new CR instance with new fields against a cluster with an old CRD manifest).
* Even if unexpected data in etcd is of the right type and does not break decoding, it has not gone through validation, and probably a admission webhook either does not exist for many CRDs or it won't have implemented pruning behaviour. Pruning at the decoding step enforces this (scenario: applying a new CR instance with new fields against a cluster with an old CRD manifest).
* Pruning is a counter-measure to security attacks which make use of knowledge of future versions of APIs with new security relevant fields. Without pruning an attacker can prepare CustomResources with privileged fields set. On version upgrade of the cluster, these fields can suddenly become alive and lead to unallowed behaviour.

### Goals
Expand Down Expand Up @@ -506,25 +506,28 @@ Note that (1) does not apply if `X` and `Y` are the same, compare examples (7)

### Opt-in and Opt-out of Pruning on CRD Level

We will add a `PruneUnknownFields` flag to `CustomResourceDefinitionSpec` of `apiextensions.k8s.io/v1beta1`:
We will add a `preserveUnknownFields` flag to `CustomResourceDefinitionSpec` of `apiextensions.k8s.io/v1beta1`:

```go
type CustomResourceDefinitionSpec struct {
...
// pruneUnknownFields enables pruning of object fields which are not
// PreserveUnknownFields disables pruning of object fields which are not
// specified in the OpenAPI schema. apiVersion, kind, metadata and known
// fields inside metadata are excluded from pruning.
// Defaults to false.
// Setting this field to true is considered an alpha API.
// Note: this will default to true in version v1.
PruneUnknownFields bool
// Defaults to true in v1beta1, and will default to false in v1.
// Setting this field to false is considered an beta API.
PreserveUnknownFields *bool
}
```

I.e. for `apiextensions.k8s.io/v1beta1` this will default to `false` for backwards compatibility.
I.e. for `apiextensions.k8s.io/v1beta1` this will default to true for backwards compatibility.

For `apiextensions.k8s.io/v1` we will change the default to `true` and forbid `false` during creation and updates if it has been `true` before. In `v1` the only way to opt-out from pruning is via setting `x-kubernetes-preserve-unknown-fields: true` in the schema. An empty or unspecified schema will prune everything.
For `apiextensions.k8s.io/v1` we will change the default to false and forbid true during creation and updates if it has been false before. In v1 the only way to opt-out from pruning is via setting `x-kubernetes-preserve-unknown-fields: true` in the schema. An empty or unspecified schema will prune everything.

We will hide `preserveUnknownFields` in v1 objects if it is false.

When CRD authors switch on pruning for a existing CRDs, they are supposed to make their users trigger a data migration of existing objects in etcd, be it via an external migration mechanism, an operator rewriting all objects or manually procedures.

### References

Expand All @@ -536,14 +539,14 @@ For `apiextensions.k8s.io/v1` we will change the default to `true` and forbid `f

**blockers for alpha:**

We default `PruneUnknownFields` to false and hence switch off the whole code path doing pruning. This reduces risk for everybody not using this alpha feature.
We default `preserveUnknownFields` to true and hence switch off the whole code path doing pruning. This reduces risk for everybody not using this feature.

* we add unit tests for the general pruning algorithm
* we add apiextensions-apiserver integration tests to
* verify that the pruning feature is actually off if `PruneUnknownFields` is false.
* verify that `PruneUnknownFields` is defaulted to false.
* verify that pruning happens if `PruneUnknownFields` is true, for all versions in the CRD according to the schema of the respective version.
* verify that `metadata`, `apiVersion`, `kind` are preserved if `PruneUnknownFields` is true and there is no schema given in the CRD.
* verify that the pruning feature is actually off if `preserveUnknownFields` is true.
* verify that `preserveUnknownFields` is defaulted to true.
* verify that pruning happens if `preserveUnknownFields` is false, for all versions in the CRD according to the schema of the respective version.
* verify that `metadata`, `apiVersion`, `kind` are preserved if `preserveUnknownFields` is false and there is no schema given in the CRD.

**blockers for beta:**

Expand All @@ -561,23 +564,33 @@ We default `PruneUnknownFields` to false and hence switch off the whole code pat

### Upgrade / Downgrade Strategy

* setting `PruneUnknownFields` to true is considered alpha quality in 1.15.
* downgrading to 1.14 will lose `pruneUnknownFields: true`, but that's acceptable.
* downgrading from 1.16 (where pruning might be beta) to 1.15 will keep the same behaviour as we don't feature gate `pruneUnknownFields: true`.
* upgrading from 1.14 will default to `pruneUnknownFields: false` and hence change no behaviour.
We aim at implementing this feature right away as beta:

* in order to get users' exposure to the feature with real CustomResourceDefinitions
* because the API surface is tiny such that we don't expect change in that area
* the pruning algorithm is simple enough that we feel confident with thourough test coverage that the risk is small.

Hence, we assume to be at beta in 1.15 and GA in 1.16 guide by the graduation criteria, leading the following upgrade/downgrade strategy:

* setting `preserveUnknownFields` to false is considered beta quality in 1.15.
* downgrading to 1.14 will lose `preserveUnknownFields: false`, but that's acceptable for beta.
* downgrading from 1.16 (where pruning might be GA) to 1.15 will keep the same behaviour as we don't feature gate `preserveUnknownFields: false`.
* upgrading from 1.14 will default to `preserveUnknownFields: true` and hence changes no behaviour.
* upgrading from 1.15 will keep the value and hence change no behaviour.
* when `v1` of `apiextensions.k8s.io` is added, we will keep the old pruning behaviour for CRDs created in `v1beta1` with `pruneUnknownFields: false`, but enforce `pruneUnknownFields: true` for every newly create `v1` CRD. Hence, we keep backwards compatibility.
* when v1 of `apiextensions.k8s.io` is added, we will keep the old pruning behaviour for CRDs created in v1beta1 with `preserveUnknownFields: true`, but forbid `preserveUnknownFields: true` for every newly create v1 CRD. Hence, we keep backwards compatibility.

Tehnically, it is still possible to get the old behaviour even in v1 by setting `x-kubernetes-preserve-unknown-fields: true` at the root level. But we enforce the definition of a schema, at least with this minimal contents.

### Version Skew Strategy

* kubectl is not aware of pruning in relevant way
* posting `pruneUnknownFields: true` alpha quality CRD to an old server will disable pruning. But that's acceptable.
* posting `preserveUnknownFields: false` beta quality CRDs to an old server will disable pruning. But that's acceptable.

## Alternatives Considered

* in [GDoc which preceded this KEP](https://docs.google.com/document/d/1rBn6SZM7NsWxzBN41J2kO2Odf07PeGPygatM_1RwofY/edit#heading=h.4qdisqud6z3t) we considered a number of alternatives, including using a skeleton schema approach. We decided against that because of its complex semantics. In contrast, the _structural schema_ of the [KEP Vanilla OpenAPI Subset: Structural Schema](https://github.com/kubernetes/enhancements/pull/1002) is the natural output of schema generators deriving a schema from Golang structs. This matches the behavior of pruning through JSON unmarshalling, independently of any value validation the developer adds on top.
* we could allow nested `x-kubernetes-preserve-unknown-fields: false`, i.e. to switch on pruning again for a subtree. This might encourage non-Kubernetes-like API types. It is unclear whether there are use-cases we want to support which need this. We can add this in the future.
* we could allow per-version opt-in/out of pruning via `PruneUnknownFields` in `CustomResourceDefinitionVersion`. For the goal of data consistency and security a CRD with semi-enabled pruning does not make much sense. The main reason to not enable pruning will probably be the lack of a complete structural schema. If this is added for one version, it should be possible for all other versions as well as it is less a technical, but a CRD development life-cycle question.
* we could allow per-version opt-in/out of pruning via `preserveUnknownFields` in `CustomResourceDefinitionVersion`. For the goal of data consistency and security a CRD with semi-enabled pruning does not make much sense. The main reason to not enable pruning will probably be the lack of a complete structural schema. If this is added for one version, it should be possible for all other versions as well as it is less a technical, but a CRD development life-cycle question.
* we intensively considered avoiding a new `x-kubernetes-preserve-unknown-fields` vendor extension in favor of recursive `additionalProperties` semantics. We decided against because:

None of OpenAPI v3 schema constructs have effects recursively. We would conflict with that pattern.
Expand Down

0 comments on commit 236e1e0

Please sign in to comment.