Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Add Kubernetes 1.22 to CI (Debug status subresource admission errors) #11805

Closed
wants to merge 5 commits into from

Conversation

julz
Copy link
Member

@julz julz commented Aug 16, 2021

No description provided.

@knative-prow-robot knative-prow-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 16, 2021
@google-cla google-cla bot added the cla: yes Indicates the PR's author has signed the CLA. label Aug 16, 2021
@knative-prow-robot knative-prow-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Aug 16, 2021
@knative-prow-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: julz

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow-robot knative-prow-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 16, 2021
@codecov
Copy link

codecov bot commented Aug 16, 2021

Codecov Report

Merging #11805 (27cd6f8) into main (d0d882f) will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##             main   #11805   +/-   ##
=======================================
  Coverage   87.81%   87.81%           
=======================================
  Files         196      196           
  Lines        9394     9394           
=======================================
  Hits         8249     8249           
  Misses        889      889           
  Partials      256      256           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d0d882f...27cd6f8. Read the comment docs.

@julz
Copy link
Member Author

julz commented Aug 16, 2021

well, looks like good news we can reproduce a problem, bad news we have a problem!

cc @dprotaso @markusthoemmes fyi

@julz
Copy link
Member Author

julz commented Aug 16, 2021

from the logs, consistent with #11448:

stream.go:254: W 10:44:24.545 controller-744c497b75-tk22r [knative.dev.serving.pkg.reconciler.revision.Reconciler] [serving-tests/should-have-stdin-e-o-f-sfngfjje-00001] Failed to update resource status err=admission webhook "webhook.serving.knative.dev" denied the request: mutation failed: cannot decode incoming new object: json: unknown field "subresource"

@julz
Copy link
Member Author

julz commented Aug 16, 2021

looks like the error is probably coming from somewhere around here. Wonder if pre-1.22 subresources weren't sent to web hooks and now they are?

@knative-prow-robot knative-prow-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. area/API API objects and controllers and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Aug 16, 2021
@julz julz changed the title WIP: Add Kubernetes 1.22 to CI WIP: Add Kubernetes 1.22 to CI (Debug status subresource admission errors) Aug 16, 2021
@julz julz force-pushed the desmond-kubernetes-one-point-twotwo branch from 5f3a248 to 53e6a08 Compare August 16, 2021 13:03
@julz julz force-pushed the desmond-kubernetes-one-point-twotwo branch from 53e6a08 to 27cd6f8 Compare August 16, 2021 14:41
@knative-prow-robot
Copy link
Contributor

@julz: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Rerun command
pull-knative-serving-build-tests 27cd6f8 link /test pull-knative-serving-build-tests

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@julz
Copy link
Member Author

julz commented Aug 16, 2021

looks like managedFields now has a subresources field as of kubernetes/kubernetes#100970 which our types dont have. I guess we may need to bump our k8s libs (or hack a workaround) to support 1.22 🤔

@julz
Copy link
Member Author

julz commented Aug 16, 2021

tho seems like we'd need at least 1.22 libs to help here since that's when the field arrived

@cardil
Copy link

cardil commented Aug 16, 2021

I think this needs deps on knative/pkg to be aligned with k8s 0.22+ deps - see: knative/pkg#2145

See also a client problems I've occurred bumping up to newer version knative/client#1209

@julz
Copy link
Member Author

julz commented Aug 16, 2021

right @cardil but Im not sure we can bump to 1.22 due to needing to support previous 3 versions of k8s (we can go to 1.21 since our next release will support 1.20+). @dprotaso suggests we could ask k8s for a cherry-pick of the field to 1.21.

@markusthoemmes
Copy link
Contributor

The offender is actually

serving/cmd/webhook/main.go

Lines 121 to 122 in 6a54ed8

// Whether to disallow unknown fields.
true,

which causes

https://github.com/knative/pkg/blob/bf176d5654b9edf903841dd38fc36c56f12ddd4e/webhook/resourcesemantics/validation/validation_admit.go#L114-L116

To throw us out. Veeeeery unfortunate as this is actually designed to only disallow fields in "our" types. Flipping this flag makes everything (at least on the surface) work as expected.

Thinking about the least invasive fix for this...

@markusthoemmes
Copy link
Contributor

Confirmed that patching the Subresource string field into ManagedFieldsEntry makes this fly without any other changes. That's akin to backport the field to 1.21 (or even 1.20). That might be a feasible option for anybody who'd need this to fly quickly.

diff --git a/vendor/k8s.io/apimachinery/pkg/apis/meta/v1/types.go b/vendor/k8s.io/apimachinery/pkg/apis/meta/v1/types.go
index d84878d7c..522336cba 100644
--- a/vendor/k8s.io/apimachinery/pkg/apis/meta/v1/types.go
+++ b/vendor/k8s.io/apimachinery/pkg/apis/meta/v1/types.go
@@ -1158,6 +1158,15 @@ type ManagedFieldsEntry struct {
 	// FieldsV1 holds the first JSON version format as described in the "FieldsV1" type.
 	// +optional
 	FieldsV1 *FieldsV1 `json:"fieldsV1,omitempty" protobuf:"bytes,7,opt,name=fieldsV1"`
+
+	// Subresource is the name of the subresource used to update that object, or
+	// empty string if the object was updated through the main resource. The
+	// value of this field is used to distinguish between managers, even if they
+	// share the same name. For example, a status update will be distinct from a
+	// regular update using the same manager name.
+	// Note that the APIVersion field is not related to the Subresource field and
+	// it always corresponds to the version of the main resource.
+	Subresource string `json:"subresource,omitempty" protobuf:"bytes,8,opt,name=subresource"`
 }
 
 // ManagedFieldsOperationType is the type of operation which lead to a ManagedFieldsEntry being created.

@julz
Copy link
Member Author

julz commented Aug 17, 2021

Slack thread here where we landed on a different approach but.. I personally don't actually hate the idea of patching that much. It'll disappear once we get a cherry pick 🤷

@julz
Copy link
Member Author

julz commented Aug 23, 2021

This has served its purpose: we understand the problem and have various ideas for how to fix it. I think the easiest/simplest thing for serving (although perhaps not Tekton, unfortunately, so we'll need to think about that) is to set DisallowUnknownFields back to false since we have schema validation performing the same job now, and that seems to best match the semantics and intended usage of upstream k8s, based on our conversations with them. I'll open a separate PR for us to discuss that.

/close

@knative-prow-robot
Copy link
Contributor

@julz: Closed this PR.

In response to this:

This has served its purpose: we understand the problem and have various ideas for how to fix it. I think the easiest/simplest thing for serving (although perhaps not Tekton, unfortunately, so we'll need to think about that) is to set DisallowUnknownFields back to false since we have schema validation performing the same job now, and that seems to best match the semantics and intended usage of upstream k8s, based on our conversations with them. I'll open a separate PR for us to discuss that.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/API API objects and controllers cla: yes Indicates the PR's author has signed the CLA. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants