Select entrypoint command based on runtime platform #4420

Merged
merged 4 commits into from
Jan 4, 2022

Conversation

imjasonh
Member

@imjasonh imjasonh commented Dec 14, 2021

This fixes a long-standing bug affecting heterogeneous clusters, where the controller's platform would be used to look up the image's entrypoint, instead of the platform of the node where the workload would eventually run.

With this change, the controller looks up all of the image's entrypoints and passes them to the entrypoint binary on the node, which uses its own runtime platform to look up the correct entrypoint to execute.
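To illustrate the node-side half of that flow, here is a minimal sketch, not the code from this PR. It assumes the controller serializes the per-platform commands as a JSON object keyed by a platform string such as `linux/amd64`; the key format and the names `platformKey` and `selectCommand` are hypothetical and chosen only for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"runtime"
)

// platformKey builds a hypothetical map key such as "linux/amd64" or
// "linux/arm/v7". The real key format used by the PR may differ.
func platformKey(os, arch, variant string) string {
	if variant == "" {
		return os + "/" + arch
	}
	return os + "/" + arch + "/" + variant
}

// selectCommand decodes the controller-supplied JSON map of
// platform -> command and returns the command for the given key.
func selectCommand(encoded, key string) ([]string, error) {
	var commands map[string][]string
	if err := json.Unmarshal([]byte(encoded), &commands); err != nil {
		return nil, fmt.Errorf("decoding commands: %w", err)
	}
	cmd, ok := commands[key]
	if !ok {
		return nil, fmt.Errorf("no command for platform %q", key)
	}
	return cmd, nil
}

func main() {
	// Example payload (made up): what the controller might pass down to the Pod.
	encoded := `{"linux/amd64":["/bin/sh"],"windows/amd64":["cmd.exe","/S","/C"]}`

	// The lookup runs on the node, so runtime.GOOS and runtime.GOARCH
	// describe the node's platform rather than the controller's.
	key := platformKey(runtime.GOOS, runtime.GOARCH, "")
	cmd, err := selectCommand(encoded, key)
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println(cmd)
}
```

The point of the sketch is only that the platform decision moves to the process that actually runs on the node; the controller no longer has to guess.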

This has the added benefit that we can now pass the entire image@digest of the multi-platform image down to the Pod, instead of the (controller's) platform-specific image. That helps in scenarios where Pods may be blocked from running unsigned/untrusted images, since it might be the multi-platform image index that's signed/trusted, and not any particular platform-specific constituent image.

NB: There is still a small bug here, specifically with Windows images. If a multi-platform image contains multiple entries for the same OS+arch+variant but with different osversion (golang, for instance), this change may not select the right command: it may take the last matching platform's command rather than the "correct" one for the runtime platform. When that happens, the "correct" image will be selected for the runtime node's platform (taking osversion into account), but it will run with the "incorrect" image's command. In practice, every image I could find in this state has the same command for all osversions, so it's moot. But it's possible a multi-platform image will tickle this bug.

The "good" news is that this bug also existed in the previous incarnation of the code: it would select the command for the first matching platform (not taking into account variant or osversion), then pass it to the runtime node, which might expect a different command.

To fix this bug, you'd need to include the osversion in the command map key, and (somehow) detect the node's osversion from the entrypoint binary at runtime, to select the correct platform (including osversion).

In practice, AFAIK, this never happened. But it's worth calling out while I'm still thinking about it: if something fishy happens on Windows clusters in the future, this comment might be useful to someone. If you're reading this and nodding in excitement that I might be describing your bug, may god have mercy on your soul.
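The key collision described above can be sketched as follows. This is an illustration under assumptions, not the PR's implementation: the `platform` and `entry` types, both key functions, the OS version strings, and the commands are made up for the example.

```go
package main

import "fmt"

// platform mirrors the fields an image index records per entry.
// The type and field names are hypothetical.
type platform struct {
	OS, Arch, Variant, OSVersion string
}

// entry pairs one platform-specific image with its command.
type entry struct {
	Platform platform
	Command  []string
}

// keyWithoutOSVersion ignores OSVersion, so entries that differ only in
// OS version collapse onto the same key.
func keyWithoutOSVersion(p platform) string {
	return p.OS + "/" + p.Arch + "/" + p.Variant
}

// keyWithOSVersion includes OSVersion, which keeps such entries distinct.
func keyWithOSVersion(p platform) string {
	return keyWithoutOSVersion(p) + ":" + p.OSVersion
}

// buildCommandMap indexes commands by key; on a collision the later
// entry overwrites the earlier one ("last matching platform wins").
func buildCommandMap(entries []entry, key func(platform) string) map[string][]string {
	m := make(map[string][]string, len(entries))
	for _, e := range entries {
		m[key(e.Platform)] = e.Command
	}
	return m
}

func main() {
	// Two Windows entries that differ only in OS version (values made up).
	entries := []entry{
		{platform{"windows", "amd64", "", "10.0.17763.1"}, []string{"old.exe"}},
		{platform{"windows", "amd64", "", "10.0.20348.1"}, []string{"new.exe"}},
	}

	collapsed := buildCommandMap(entries, keyWithoutOSVersion)
	fmt.Println(len(collapsed), collapsed["windows/amd64/"]) // 1 [new.exe]

	distinct := buildCommandMap(entries, keyWithOSVersion)
	fmt.Println(len(distinct)) // 2
}
```

Adding the OS version to the key is only half of the fix: as noted above, the entrypoint binary would still need some way to detect the node's osversion at runtime, which this sketch does not attempt.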

Fixes #4299
/kind bug

Submitter Checklist

As the author of this PR, please check off the items in this checklist:

  • Docs included if any changes are user facing
  • Tests included if any functionality added or changed
  • Follows the commit message standard
  • Meets the Tekton contributor standards (including functionality, content, code)
  • Release notes block below has been filled in or deleted (only if no user facing changes)

Release Notes

Changes the way image commands are passed to the entrypoint executor, enabling more correct behavior in heterogeneous clusters, and allowing for multi-platform image references to be passed to generated Pods.

cc @mattmoor

@imjasonh imjasonh requested a review from vdemeester December 14, 2021 04:21
@tekton-robot tekton-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/bug Categorizes issue or PR as related to a bug. labels Dec 14, 2021
@linux-foundation-easycla

linux-foundation-easycla bot commented Dec 14, 2021

CLA Not Signed

@tekton-robot tekton-robot requested a review from dibyom December 14, 2021 04:21
@tekton-robot tekton-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Dec 14, 2021
@tekton-robot
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/pod/entrypoint.go 86.8% 87.8% 1.0
pkg/pod/entrypoint_lookup.go 79.4% 85.7% 6.3

@imjasonh imjasonh force-pushed the entrypoint-platform branch from e208b39 to f30d280 Compare December 14, 2021 04:25
@tekton-robot
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/pod/entrypoint.go 86.8% 87.8% 1.0
pkg/pod/entrypoint_lookup.go 79.4% 85.7% 6.3

@tekton-robot tekton-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Dec 14, 2021
@tekton-robot
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/entrypoint/entrypointer.go 71.0% 70.1% -0.9
pkg/pod/entrypoint.go 86.8% 87.8% 1.0
pkg/pod/entrypoint_lookup.go 79.4% 85.7% 6.3

@tekton-robot tekton-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Dec 15, 2021
@tekton-robot
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/entrypoint/entrypointer.go 71.0% 70.1% -0.9
pkg/pod/entrypoint.go 86.8% 87.8% 1.0
pkg/pod/entrypoint_lookup.go 79.4% 85.7% 6.3

@tekton-robot
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/entrypoint/entrypointer.go 71.0% 70.1% -0.9
pkg/pod/entrypoint.go 86.8% 88.2% 1.3
pkg/pod/entrypoint_lookup.go 79.4% 88.0% 8.6

@tekton-robot
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/entrypoint/entrypointer.go 71.0% 70.1% -0.9
pkg/pod/entrypoint.go 86.8% 88.0% 1.2
pkg/pod/entrypoint_lookup.go 79.4% 88.0% 8.6

@imjasonh imjasonh force-pushed the entrypoint-platform branch from aafc270 to 0959669 Compare December 15, 2021 04:15
@tekton-robot
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/entrypoint/entrypointer.go 71.0% 70.1% -0.9
pkg/pod/entrypoint.go 86.8% 88.0% 1.2
pkg/pod/entrypoint_lookup.go 79.4% 88.0% 8.6

@tekton-robot
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/entrypoint/entrypointer.go 71.0% 70.1% -0.9
pkg/pod/entrypoint.go 86.8% 88.0% 1.2
pkg/pod/entrypoint_lookup.go 79.4% 88.0% 8.6

@tekton-robot
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/entrypoint/entrypointer.go 71.0% 70.1% -0.9
pkg/pod/entrypoint.go 86.8% 88.0% 1.2
pkg/pod/entrypoint_lookup.go 79.4% 88.0% 8.6

@imjasonh
Member Author

/test check-pr-has-kind-label

@afrittoli afrittoli closed this Dec 15, 2021
@afrittoli afrittoli reopened this Dec 15, 2021
@tekton-robot
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/entrypoint/entrypointer.go 71.0% 70.1% -0.9
pkg/pod/entrypoint.go 86.8% 88.0% 1.2
pkg/pod/entrypoint_lookup.go 79.4% 88.0% 8.6

@WillsonHG

/easycla

@tekton-robot
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/entrypoint/entrypointer.go 71.0% 70.1% -0.9
pkg/pod/entrypoint.go 86.8% 88.0% 1.2
pkg/pod/entrypoint_lookup.go 79.4% 88.0% 8.6

@imjasonh imjasonh force-pushed the entrypoint-platform branch from 18a2c99 to e7d02c9 Compare December 20, 2021 17:02
@tekton-robot
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/entrypoint/entrypointer.go 71.0% 70.1% -0.9
pkg/pod/entrypoint.go 86.8% 88.0% 1.2
pkg/pod/entrypoint_lookup.go 79.4% 88.0% 8.6

@imjasonh imjasonh added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Dec 22, 2021
@imjasonh imjasonh closed this Jan 3, 2022
@imjasonh imjasonh reopened this Jan 3, 2022
@imjasonh
Member Author

imjasonh commented Jan 3, 2022

I've managed to appease EasyCLA, just needs an approval @vdemeester @afrittoli 🙏

@tekton-robot
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/entrypoint/entrypointer.go 71.0% 70.1% -0.9
pkg/pod/entrypoint.go 86.8% 88.0% 1.2
pkg/pod/entrypoint_lookup.go 79.4% 85.7% 6.3

@mattmoor
Member

mattmoor commented Jan 3, 2022

/test check-pr-has-kind-label

This also needs a kick and I'm not sure it'll listen to me, but YOLO

/lgtm

@imjasonh
Member Author

imjasonh commented Jan 3, 2022

/test check-pr-has-kind-label

@vdemeester
Member

/test check-pr-has-kind-label

Member

@vdemeester vdemeester left a comment

One nit comment on a comment, otherwise, looking very good 😬
/lgtm

@tekton-robot tekton-robot removed the lgtm Indicates that a PR is ready to be merged. label Jan 4, 2022
@tekton-robot
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/entrypoint/entrypointer.go 71.0% 70.1% -0.9
pkg/pod/entrypoint.go 86.8% 88.0% 1.2
pkg/pod/entrypoint_lookup.go 79.4% 85.7% 6.3

@vdemeester
Member

/lgtm

@tekton-robot tekton-robot added the lgtm Indicates that a PR is ready to be merged. label Jan 4, 2022
@imjasonh
Member Author

imjasonh commented Jan 4, 2022

/test check-pr-has-kind-label

@dlorenc
Contributor

dlorenc commented Jan 4, 2022

/approve

@tekton-robot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dlorenc, mattmoor

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@tekton-robot tekton-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 4, 2022
@vdemeester vdemeester removed the kind/bug Categorizes issue or PR as related to a bug. label Jan 4, 2022
@imjasonh
Member Author

imjasonh commented Jan 4, 2022

/kind bug

@tekton-robot tekton-robot added the kind/bug Categorizes issue or PR as related to a bug. label Jan 4, 2022
@tekton-robot tekton-robot merged commit b222bf8 into tektoncd:main Jan 4, 2022
@tekton-robot
Collaborator

@imjasonh: The following test failed, say /retest to rerun them all:

Test name Commit Details Rerun command
pull-tekton-pipeline-alpha-integration-tests 81691e9 link /test pull-tekton-pipeline-alpha-integration-tests

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cla: yes Trying to make the CLA bot happy with ppl from different companies work on one commit cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm Indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Tekton entrypoint resolution can break signature verification.
7 participants