
KEP: Pruning for CustomResources #2443

Closed

Conversation

@sttts (Contributor) commented Jul 31, 2018

No description provided.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jul 31, 2018
@k8s-ci-robot k8s-ci-robot added kind/design Categorizes issue or PR as related to design. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jul 31, 2018
@sttts sttts changed the title Initial transformation of GDoc to markdown WIP: KEP: Pruning and Defaulting for CustomResources Jul 31, 2018
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 31, 2018
@sttts sttts force-pushed the sttts-crd-defaulting-and-pruning branch 2 times, most recently from 054263b to 8a2ffb3 Compare July 31, 2018 18:35
@deads2k (Contributor) commented Jul 31, 2018

@mbohlool @apelisse fyi

@sttts can you replicate the outstanding comments here like Mehdy did in #2420 so we can re-find our comments?

> 4. Only consider `properties` fields outside of `anyOf`, `allOf`, `oneOf`, `not` during pruning, but enforce that every `properties` key inside `anyOf`, `allOf`, `oneOf`, `not` also appears outside all of those.
>
> From these options:
> 1. leads to fields being kept from pruning although they only appear in branches of the OpenAPI validation schema which do not contribute to validation of an object.
Contributor:

I think this is the one that I would expect from pruning. Any field mentioned anywhere is kept.


> Even worse: forcing onto another branch might make more defaults applicable. In other words, we would need to run validation with defaulting until it converges. This is a hint that the semantics of defaulting in full OpenAPI schemata are not well defined either.
>
> **We propose to allow defaults only outside of `anyOf`, `allOf`, `oneOf`, `not`.**
Contributor:

And I remember being ok with this as a starting point which could later be expanded if we so desired.

Contributor Author:

Adding a comment.

@sttts sttts force-pushed the sttts-crd-defaulting-and-pruning branch from 8a2ffb3 to c8bf99a Compare August 1, 2018 17:24
@k8s-ci-robot k8s-ci-robot added the sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. label Aug 1, 2018
@k8s-ci-robot:

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To fully approve this pull request, please assign additional approvers.
We suggest the following additional approver: bgrant0607

If they are not already assigned, you can assign the PR to them by writing /assign @bgrant0607 in a comment when ready.

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sttts sttts force-pushed the sttts-crd-defaulting-and-pruning branch 2 times, most recently from b3358d0 to 2fcab11 Compare August 1, 2018 17:29
@sttts sttts changed the title WIP: KEP: Pruning and Defaulting for CustomResources KEP: Pruning and Defaulting for CustomResources Aug 1, 2018
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 1, 2018
>
> Then `a`, `b`, `b.x`, `c`, `d` are specified, `b.y` and `e` are not specified.

> **Question:** is it natural semantics to assume `d` to be specified and not to prune it? Note that there is no object with `d` that validates against `"d":{not:{}}`. But due to the `anyOf` there are objects which will validate against the complete schema.
Contributor Author (@sttts, Aug 1, 2018):

Mehdy Bohlool: There should be a difference between not specifying "d" and specifying it like this. Not specifying it means pruning; defining it like this means validation fails.

Stefan Schimanski: OpenAPI is too powerful: you can specify it this way, and validation still succeeds. That's basically our semantics problem: we have three cases:

  1. not specified
  2. specified and validating
  3. specified and not validating (but the whole schema validates).

Contributor Author:

David Eads: I would expect d to remain even if it is not used to satisfy validation. I expect d to remain because its potential presence is acknowledged, and the fact that the subrule can never be satisfied isn't a level of introspection we should perform.

Antoine Pelisse: +1

Member:

Allowing "d" to remain seems sufficient for the "security" use case. I don't understand whether it is sufficient for the broader "consistency" use case mentioned in the overview.

Contributor Author:

My take on this is that we could go both routes: prune or not prune in these cases. @deads2k's intuition to keep d is much simpler semantics-wise: you don't have to understand the schema deeply to know what to prune; the semantics is purely driven by syntax.


> Motivated by the examples, we have different options to define the pruning semantics:
> 1. Use the described semantics of `specified`.
> 2. Only consider `properties` fields in the schema which actually successfully validate a given object (then `d` would be pruned in example 3).
Contributor Author:

David Eads: I'd prefer to not do this.

Member:

CRD authors may add extra validation in an admission webhook. They may move checks between JSON schema and webhooks as is convenient to them.

With that in mind, to implement 2, would you call any validation webhooks during decoding? If so, does it get convoluted to do that from the decoding stage? If not, will it lead to changes in pruning behavior if a CRD author moves where the validation for a field is done?

This line of thinking makes me think 2 is too complex.

Contributor Author:

Very good point about the webhooks.

> Motivated by the examples, we have different options to define the pruning semantics:
> 1. Use the described semantics of `specified`.
> 2. Only consider `properties` fields in the schema which actually successfully validate a given object (then `d` would be pruned in example 3).
> 3. Only consider `properties` fields outside of `anyOf`, `allOf`, `oneOf`, `not` during pruning.
Contributor Author:

Antoine Pelisse: I don't understand what this means. Does this mean pruning fields in the AllOf (for example), forcing it to fail validation?

Stefan Schimanski: No, it means that we only keep fields specified outside of any of anyOf, allOf, oneOf, not. So if you forget to specify a field outside, but only inside an anyOf, it will be pruned. Probably not what we want.
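Stefan's distinction between options 1 and 3 can be sketched in a few lines of Python. This is a hypothetical illustration, not the proposed implementation: schemas are plain dicts as in the KEP's JSON examples, and `known_keys`/`prune` are made-up helper names.

```python
# Hypothetical sketch contrasting pruning option 1 (keep any field mentioned
# anywhere in the schema) with option 3 (keep only fields listed in
# `properties` outside of anyOf/allOf/oneOf/not).

JUNCTORS = ("anyOf", "allOf", "oneOf", "not")

def known_keys(schema, include_junctors):
    """Collect the set of top-level property names a schema mentions."""
    keys = set(schema.get("properties", {}))
    if include_junctors:
        for j in JUNCTORS:
            branches = schema.get(j, [])
            if isinstance(branches, dict):  # `not` holds a single schema
                branches = [branches]
            for branch in branches:
                keys |= known_keys(branch, True)
    return keys

def prune(obj, schema, include_junctors):
    """Drop top-level fields that the schema does not mention."""
    keep = known_keys(schema, include_junctors)
    return {k: v for k, v in obj.items() if k in keep}

schema = {
    "properties": {"a": {}, "b": {}},
    "anyOf": [{"properties": {"d": {"not": {}}}}],
}
obj = {"a": 1, "b": 2, "d": 3, "e": 4}

print(prune(obj, schema, include_junctors=True))   # option 1: keeps d
print(prune(obj, schema, include_junctors=False))  # option 3: prunes d
```

With option 1, `d` survives because it is mentioned inside the `anyOf` branch; with option 3 it is pruned, which is the "probably not what we want" behavior Stefan describes.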

> 1. Use the described semantics of `specified`.
> 2. Only consider `properties` fields in the schema which actually successfully validate a given object (then `d` would be pruned in example 3).
> 3. Only consider `properties` fields outside of `anyOf`, `allOf`, `oneOf`, `not` during pruning.
> 4. Only consider `properties` fields outside of `anyOf`, `allOf`, `oneOf`, `not` during pruning, but enforce that every `properties` key inside `anyOf`, `allOf`, `oneOf`, `not` also appears outside all of those.
Contributor Author:

Antoine Pelisse: Isn't that the same as the one before? Since if you drop the fields in oneOf, allOf, anyOf, they are always going to be invalid, so they have to also appear in properties.

Stefan Schimanski: We would reject the CRD if the outer properties are not a super-set of the ones under all/any/oneOf/not.

So option 4 protects at least the user from mistakes.

Note: all of these are not about validating a CR or not, but about how to specify what you don't want pruned via CRD OpenAPI validation schemata.
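The structural check behind option 4 is cheap because it only looks at property names, never at values. A minimal sketch (hypothetical helper names, checking only one object level):

```python
# Hypothetical sketch of option 4's structural check: reject a CRD schema if
# some `properties` key under anyOf/allOf/oneOf/not does not also appear in
# the outer `properties`, because that key would silently be pruned.

JUNCTORS = ("anyOf", "allOf", "oneOf", "not")

def inner_keys(schema):
    """Property names mentioned anywhere under the junctors."""
    keys = set()
    for j in JUNCTORS:
        branches = schema.get(j, [])
        if isinstance(branches, dict):  # `not` holds a single schema
            branches = [branches]
        for branch in branches:
            keys |= set(branch.get("properties", {}))
            keys |= inner_keys(branch)
    return keys

def check_option4(schema):
    outer = set(schema.get("properties", {}))
    missing = inner_keys(schema) - outer
    if missing:
        raise ValueError(f"properties only mentioned inside junctors: {sorted(missing)}")

check_option4({"properties": {"a": {}}, "anyOf": [{"properties": {"a": {}}}]})  # accepted
try:
    check_option4({"properties": {}, "anyOf": [{"properties": {"b": {}}}]})
except ValueError as e:
    print(e)  # rejected: b would silently be pruned
```

This is the "reject the CRD if the outer properties are not a super-set" rule from the thread above, expressed purely syntactically.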

> An empty object `{}` would be defaulted to `{"a":{"x":42}}`.
>
> Obviously, the default values should validate. But OpenAPI validation schemata do not enforce that. Also, one default in a branch of `anyOf`, `allOf`, `oneOf` might force another validation run onto another branch of these quantifiers.
Contributor Author (@sttts, Aug 1, 2018):

About "default values should validate":

David Eads: this doesn't bother me

Stefan Schimanski: We do a second round of validation in the registry validation step. So we catch them.

About "might force another validation run onto another branch of these quantifiers":

David Eads: This does bother me. Cyclical?

Stefan Schimanski: It will converge, as the number of branches is finite and defaulting is idempotent. But it is a strange behaviour nevertheless.
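The convergence argument can be illustrated with a toy defaulting pass over the KEP's `{}` → `{"a":{"x":42}}` example: each pass only adds absent fields and never removes any, so with finitely many defaults the object reaches a fixed point. This is a hypothetical sketch that, as the proposal suggests, reads defaults only from `properties` (never from junctor branches); `apply_defaults` and `default_until_stable` are made-up names.

```python
# Hypothetical sketch: apply defaults from `properties` only, and iterate
# until a fixed point is reached. Since a pass can only fill in absent
# fields, iteration terminates once every reachable default is applied.

def apply_defaults(obj, schema):
    out = dict(obj)
    for key, sub in schema.get("properties", {}).items():
        if key not in out and "default" in sub:
            out[key] = sub["default"]
        elif isinstance(out.get(key), dict):
            out[key] = apply_defaults(out[key], sub)
    return out

def default_until_stable(obj, schema):
    while True:
        nxt = apply_defaults(obj, schema)
        if nxt == obj:
            return nxt
        obj = nxt

schema = {"properties": {"a": {"default": {}, "properties": {"x": {"default": 42}}}}}
print(default_until_stable({}, schema))  # {'a': {'x': 42}}
```

Note the convergence here is trivial precisely because no branching is involved; with defaults inside `anyOf`/`oneOf` the fixed point could depend on branch order, which is the problem being discussed.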

Member:

How are defaults relevant if we pick option 1? Option 1 seems to only require examining property names, and not their values.

Contributor Author:

We propose below to only allow defaults outside of anyOf, allOf, .... So we don't have to decide which branches the default values fulfill, which makes the application of defaults very simple as well.

Member:

> It will converge as the number of branches is finite and defaulting is idempotent. But it is a strange behaviour nevertheless.

I'm pretty sure that this is Turing-complete, or close to it, if you can engineer branches that drop fields. Which is more or less what this KEP is proposing.

I really think we need to look hard at the schema and come up with some rules that prevent people from saying things that they can't say with Go types. I have some thoughts on how to do this, but this KEP isn't the place to hash it out. We should NOT be able to get Turing-complete behavior by iterated default application.


Even worse: forcing onto another branch might make more defaults applicable. In other words, we would need to run validation with defaulting until it converges. This is a hint that the semantics of defaulting in full OpenAPI schemata are not well defined either.

**We propose to allow defaults only outside of `anyOf`, `allOf`, `oneOf`, `not`.**
Contributor Author:

David Eads: I'm ok with this restriction to start, but I'm also ok with simply running multiple times. Eventually defaulting will stop defaulting.

Stefan Schimanski: Also note that this convergence happens at some point. But there is no guarantee that it always converges to the same value (depending on which branches we take first). We should better avoid it altogether. IMO it's a strong hint of bad semantics.

Antoine Pelisse: I agree with Stefan. I'm not sold that defaults on properties (outside xxxOf/not) completely solve that problem, though.

Antoine Pelisse: To clarify my point, some structures might validate before defaults are applied, some might not validate after, so the order still matters.

Stefan Schimanski: The skeleton has no "required". So if it validates after defaulting, it was validating before as well. So this is fine.

Antoine Pelisse: I don't think that is true, because of "not". Validation can be OK because a value wasn't there (before being defaulted): `{"properties": {"a": {"default": 42}}, "anyOf": [{"not": {"properties": {"a": {}}, "additionalProperties": false}}]}`

Stefan Schimanski: You claim to have a counterexample to the converse of my statement :-) Of course, defaulting can make an object invalid. But dropping constraints (including all negative constraints, i.e. under not or oneOf) makes a schema weaker. So the skeleton is weaker than the full schema.

Member:

I am fine with this, as users can always add defaults later in a webhook if they really need them for xxxOf blocks. (I question the need for `not` blocks, especially if OpenAPI is generated from an IDL.)

Contributor Author (@sttts, Aug 9, 2018):

Yes, I agree about `not`. IDLs (e.g. Golang types) are probably always strictly positive. I also question the need for `oneOf` for the same reason.


Even worse: forcing onto another branch might make more defaults applicable. In other words, we would need to run validation with defaulting until it converges. This is a hint that the semantics of defaulting in full OpenAPI schemata are not well defined either.

**We propose to allow defaults only outside of `anyOf`, `allOf`, `oneOf`, `not`.**
Contributor Author:

Mehdy Bohlool: Why don't we allow it, but fail on confusion/conflicts? That means validating the schema before actually using it for defaulting (and even pruning); while the schema can be a valid OpenAPI schema, we reject it because, for example, it has two defaults for a single value via anyOf.

Stefan Schimanski: The conflicts depend on inputs. Computing the generic overlap for all inputs is computationally hard and needs complete validation schemata. Structural restriction is far less complex and allows unspecified validation schemata.

>}
>```
>
> Take an object `{"a": "1", "b":"foo", "c":"bar"}`. Assuming we follow pruning option 1, it is pruned to `{"a": "1", "b":"foo", "c":"bar"}` and then defaulted to `{"a":"1", "b": "foo", "c":"bar", "d": 1}`.
Contributor Author:

David Eads: I would not have expected c to be pruned. I'd expect c to be kept and d=1 (first rule in what I described).

I'm also ok with restricting this for a first impl.

Antoine Pelisse: How would this work with additionalProperties: false?

Stefan Schimanski: We forbid that additionalProperties and properties are used in parallel for the same object/map. So additionalProperties: false means to prune everything.
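The rule Stefan describes, that `properties` and `additionalProperties` are never combined for the same object, and that `additionalProperties: false` prunes everything, can be sketched for a single object level (hypothetical helper name):

```python
# Hypothetical sketch: with `additionalProperties: false` (and no sibling
# `properties`, which the proposal forbids anyway), nothing is "specified",
# so every field is pruned; otherwise only fields listed in `properties`
# survive at this level.

def prune_level(obj, schema):
    if schema.get("additionalProperties") is False:
        return {}                       # nothing is specified: prune all
    props = schema.get("properties", {})
    return {k: v for k, v in obj.items() if k in props}

print(prune_level({"a": 1, "b": 2}, {"additionalProperties": False}))  # {}
print(prune_level({"a": 1, "b": 2}, {"properties": {"a": {}}}))        # {'a': 1}
```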

Contributor Author:

@deads2k added a comment further down that this is the starting point, with options for relaxing it later if needed.

Member:

We need to engineer our schema-acceptance criteria so we never face this question, I think.

>
> Take an object `{"a": "1", "b":"foo", "c":"bar"}`. Assuming we follow pruning option 1, it is pruned to `{"a": "1", "b":"foo", "c":"bar"}` and then defaulted to `{"a":"1", "b": "foo", "c":"bar", "d": 1}`.
>
> But what about the object `{"a": "12", "b":"foo", "c":"bar"}`? Both patterns apply. With pruning option 1 we keep `"b"` and `"c"`. But what about the default? Should we default to `1` or `2`?
Contributor Author:

Mehdy Bohlool: This is another example I think can be easily detected, and we can reject the schema because of confusion/conflicts in it. It should be clear from the schema what we are pruning and what we are defaulting. We can even try to build a list of allowed fields with their defaults from the schema (I believe that's proposal number 1), and if that process fails, we reject the schema.

Stefan Schimanski: I don't think we can easily detect overlaps. E.g. we would have to intersect regular languages (which is possible, but has high complexity).

>}
>```
>
> The type and properties are usually called schema validation, while regex patterns and formats are about values. The latter would be tested in the validation phase, the former would be checked during decoding for native types in the JSON decoder.
Contributor Author:

Mehdy Bohlool: I am not sure about format. It is usually used as an extra identifier with type and may not actually validate the value. Though the border here is blurry: when we say the type is string and the format is date, is it considered a validation, or is it the "date" type? My understanding was that "type" is the primary type and "format" is a secondary type.

Stefan Schimanski: Added a paragraph below about our discussion around keeping type and format in the skeleton.

Member:

The only purpose of pruning that is clear to me is the security one. For that, format does not help; type information does not even help. The motivation for adding those during decoding seems to be "similarity to core types", which is not stated as an explicit goal of the design, but seems to be a theme.

Contributor Author:

Right. It's mostly about getting similar error messages.

I would claim we have a general, high-level goal for CRDs to behave like native resources. This is not special to pruning and defaulting. Rather, this goal is one driver to implement pruning and defaulting (and many other features we are adding to CRDs).

> - "@apelisse"
> approvers:
> - "@deads2k"
> - "@lavalamp - substituted by @fedebongio while @lavalamp is out"
Member:

I'm back, sorry!

@lavalamp (Member):

Given the difficulty of fixing specs, and the fact that we have tools that should give correct specs, can we prune assuming the spec is correct and generated by kube-openapi / kube-builder? I think we should do this or just not prune at all until we can tell if the spec is acceptable or not.

I'm not in favor of pruning based on specs that use features we don't support.

@deads2k (Contributor) commented Aug 15, 2018

> Given the difficulty of fixing specs, and the fact that we have tools that should give correct specs, can we prune assuming the spec is correct and generated by kube-openapi / kube-builder? I think we should do this or just not prune at all until we can tell if the spec is acceptable or not.
>
> I'm not in favor of pruning based on specs that use features we don't support.

Pruning or defaulting? Pruning based on the union of all mentioned fields won't strip any expected fields and will leave the reconciliation of different potential branches to validation where it is more appropriate. Pruning is to prevent field squatting and to have structural consistency in your stored data.

I think the question about "correct" specs is more about defaulting where results become slightly harder to reason about. However, crisp rules about only defaulting on non-branching paths make the behavior predictable and easy to decide about.

Separately, a decision to tighten the rules for specs we allow may be useful, and we could have a path forward to doing that based on status/failure to serve unless the "I know I'm going to be unservable in six months" flag is set. I don't think that we need to block on such a tightening to implement the pruning or defaulting though.

@lavalamp (Member):

> Pruning or defaulting?

Both, if they are writing special code to handle the weird specs, as much as I want defaulting. We shouldn't go fishing while our house is on fire. Honestly, I would have blocked adding the validation field in the first place until this was fixed, had I realized how flexible the OpenAPI schema was. I'm also extra interested in schemas being correct due to working on apply.

> Pruning based on the union of all mentioned fields won't strip any expected fields and will leave the reconciliation of different potential branches to validation where it is more appropriate.

It might fail to prune items that should be pruned.

Now, I think it is acceptable to not prune at all, either just simply rejecting unknown fields or letting the user deal with the fallout (ok for beta, likely not ok for GA). I also don't object to pruning. I think I object to pruning only most of the time, though.

> I think the question about "correct" specs is more about defaulting where results become slightly harder to reason about. However, crisp rules about only defaulting on non-branching paths make the behavior predictable and easy to decide about.

It is certainly laudable that sttts noticed these issues and is working to address them. However, I think it is fundamentally wrong that we accept specs that express behaviors we never intend to support or allow. I think that a stopgap solution of writing code that notices and works around the issue is actually worse than doing nothing, because of the precedent it sets (that we have to respond appropriately to every OpenAPI feature). In my view, all of our effort at rectifying the situation needs to go into explicitly defining the subset of specs we will support.

> Separately, a decision to tighten the rules for specs we allow may be useful and we could have a path forward to doing that based on status/failure to serve unless the "I know I'm going to be unservable in six months" flag is set.

#64841 might be homing in on a general solution to that thorny problem. CRD is still beta so we have some wiggle room.

> I don't think that we need to block on such a tightening to implement the pruning or defaulting though.

From my perspective, code that notices non-compliant specs and doesn't act on them (ideally getting an error somewhere visible) is fine and possibly our only choice given that tightening the validation will be a multi-release effort.

But I view code that attempts to repair/skeletonize/reinterpret non-compliant OpenAPI specs and then use them anyway as detrimental. Code like that is incredibly difficult to get correct in my experience, and incredibly frustrating when it does the wrong thing. I especially object if the code is silent and doesn't give users any indication that they're working off of our non-standard interpretation of their OpenAPI spec.

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Aug 22, 2018
@k8s-ci-robot:

@sttts: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 22, 2018
@sttts sttts changed the title KEP: Pruning and Defaulting for CustomResources KEP: Pruning for CustomResources Aug 22, 2018
@sttts sttts force-pushed the sttts-crd-defaulting-and-pruning branch 4 times, most recently from b374868 to e93e696 Compare August 22, 2018 15:09
@sttts sttts force-pushed the sttts-crd-defaulting-and-pruning branch from e93e696 to 2322c20 Compare August 22, 2018 15:10
@k8s-ci-robot:

@sttts: The following test failed, say /retest to rerun them all:

Test name: pull-community-verify; Commit: 2322c20; Rerun command: /test pull-community-verify

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@sttts (Contributor, Author) commented Aug 28, 2018

> However I think it is fundamentally wrong that we accept specs that express behaviors we never intend to support or allow.
>
> > Pruning based on the union of all mentioned fields won't strip any expected fields and will leave the reconciliation of different potential branches to validation where it is more appropriate.
>
> It might fail to prune items that should be pruned.
>
> Now, I think it is acceptable to not prune at all, either just simply rejecting unknown fields or letting the user deal with the fallout (ok for beta, likely not ok for GA). I also don't object to pruning. I think I object to pruning only most of the time, though.
>
> > I think the question about "correct" specs is more about defaulting where results become slightly harder to reason about. However, crisp rules about only defaulting on non-branching paths make the behavior predictable and easy to decide about.
>
> It is certainly laudable that sttts noticed these issues and is working to address them. However I think it is fundamentally wrong that we accept specs that express behaviors we never intend to support or allow. I think that a stopgap solution of writing code that notices and works around the issue is actually worse than doing nothing, because of the precedent it sets (that we have to respond appropriately to every OpenAPI feature). In my view, all of our effort at rectifying the situation needs to go into explicitly defining the subset of specs we will support.

I have tried hard to follow these harsh judgements about the approach, and I would have hoped for an open and early-on discussion about them instead of this form. Not surprisingly, I disagree with them:

  • OpenAPI validation is and always has been about the validation step in our stack, not about decoding. Today's validation code for native resources does a lot of crazy things, and the language we use there (Golang) is Turing-complete by nature. Hence, people can formulate any crazy restrictions on objects, and they do. Our OpenAPI validation schema language gives us some limited way to define validations without writing code. With admission webhooks, everybody is able to define even more crazy APIs without any limitation.

    In short: it does not matter that OpenAPI validation allows restricting APIs in ways which do not comply with our API conventions.

  • Without pruning, people define tons of APIs today which do not follow our conventions. With versioning we allow them to turn these into beta and GA APIs. Not having pruning is dangerous if we care about API conventions.

  • I claim that pruning using the skeleton schema enforces sane Golang-like schemata, without any embedded propositional logic.

    It might fail to prune items that should be pruned.

    Which are they? I read from the comments that there is either fundamental disagreement about what is intuitively expected from pruning or misunderstanding of the approach. Would be happy to understand.

  • If we start inventing our own small schema language, we will end up enforcing a schema like the skeleton, taking away expressivity from the user for more advanced validations. This is fine. But it will force people towards writing admission webhooks for anything more advanced. And how does it help to enforce API conventions (i.e. sane Kubernetes-like APIs)? IMO it won't either, see point 1. Users need expressivity, and expressivity allows them to shoot themselves in the foot. Whether the expressivity is called OpenAPI or admission webhook does not matter.

@justaugustus (Member):

/kind kep

@mattfarina (Contributor):

There is an element of end-user experience in this one. What do we want them to experience?

If we value consistency, the default experience for custom resources should be the same as built-in resources. If we apply the principle of least astonishment what should the experience be and how do we build an implementation from there?

@liggitt (Member) commented Nov 9, 2018

> There is an element of end-user experience in this one. What do we want them to experience?
>
> If we value consistency, the default experience for custom resources should be the same as built-in resources. If we apply the principle of least astonishment what should the experience be and how do we build an implementation from there?

We value both consistency and experience.

CRDs should behave consistently with built-in resources.

If there are experience issues with some of the built-in resource behaviors, we should determine if we can improve those (and ensure consistent treatment for CRDs as well as we do so).

@erictune (Member) commented Nov 9, 2018

Recording this for posterity....
CRD Conversion is set to go into v1.13. This KEP is not going to. This is okay for the following reasons:

  • CRD Conversion is important to a broad set of CRD users. Pruning, less so.
    • I've spoken to a number of sophisticated CRD users. They included:
      • contributors to what was then known as CoreOS Operators
      • Istio contributors
      • Knative contributors
    • All said that CRD conversion was an important and urgent feature. None said that Pruning is an urgent feature. (However, some core type developers said pruning was essential for security sensitive use cases.)
  • Most CRDs cannot benefit from this KEP. But all can benefit from CRD Conversion.
    • 90% of CRDs on Github didn't use .spec.validation, in a survey I did of over 400 CRDs. This doesn't mean they don't do some form of validation - they may validate in an admission webhook, or the controller may set error conditions in the status for invalid CRs. CRD Conversion does not depend on having a .spec.validation, so it benefits all CRD users.
  • We can add this KEP later.
    • Pruning can be enabled as a flag per-apiversion, later on. (It is a per-CRD flag in this KEP, but it could easily be per apiversion.)
    • Suppose The User has a CRD with version v1, which uses validation, but of course no pruning. Later, pruning support is added to CRDs in Kubernetes version v1.x.0. Now, The User adds version v2, and enables pruning for version v2. Pruning is per apiVersion, so this should work. And v1 clients with incorrect fields will still keep working.
  • There isn't consensus about pruning. However, support for apiVersion conversion appears to be universal.
  • In hindsight, some Kubernetes API experts feel it is still the right behavior, while others think we should instead have rejected unknown fields. Not listing the arguments here.

In summary, we move forward where there is greatest demand and certainty, without blocking future options.

@bgrant0607 (Member):

I agree with @erictune. Pruning can be added later.

CRDs, especially schema-less ones, are like annotations in many ways.

Related past discussions:
kubernetes/kubernetes#30819
kubernetes/kubernetes#5889

Either strict validation OR pruning can prevent the problems of circumventing validation, creating timebombs, and confusing controllers.

Each has different gotchas for clients.

@bgrant0607 (Member):

And, for the record, while I haven't had time to read this KEP, I am in favor of pruning, API consistency, and declarative schema specification. But I don't think pruning can be required for all CRDs.

@justaugustus (Member):

REMINDER: KEPs are moving to k/enhancements on November 30. Please attempt to merge this KEP before then to signal consensus.
For more details on this change, review this thread.

Any questions regarding this move should be directed to that thread and not asked on GitHub.

@justaugustus (Member):

KEPs have moved to k/enhancements.
This PR will be closed and any additional changes to this KEP should be submitted to k/enhancements.
For more details on this change, review this thread.

Any questions regarding this move should be directed to that thread and not asked on GitHub.
/close

@k8s-ci-robot:

@justaugustus: Closed this PR.

In response to this:

KEPs have moved to k/enhancements.
This PR will be closed and any additional changes to this KEP should be submitted to k/enhancements.
For more details on this change, review this thread.

Any questions regarding this move should be directed to that thread and not asked on GitHub.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Labels: cncf-cla: yes, kind/design, needs-rebase, sig/api-machinery, sig/architecture, size/L

10 participants