fix(v2,sdk.v2): Remove need for base image root access to write artifacts #6530

Closed
wants to merge 8 commits

Conversation

judahrand
Contributor

@judahrand judahrand commented Sep 8, 2021

Description of your changes:
The changes in this PR are based on @Bobgy's design here.

Custom base images may be built specifically to avoid granting root privileges. Without these privileges, the local mount points cannot be created if they live at the root of the filesystem. /gcs must remain where it is for now, as it is currently a contract for the KFP v2 Python component wrappers. Therefore, this directory is also mounted as an emptyDir so that it already exists and is accessible to all users.
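
To illustrate the failure mode this avoids, here is a minimal Go sketch (not code from this PR) of what a non-root process runs into when it has to create the mount point itself:

```go
package main

import (
	"fmt"
	"os"
)

func main() {
	// Running as a non-root user, creating a directory directly under "/"
	// typically fails with "permission denied" because "/" is owned by root
	// and not world-writable. Pre-mounting /gcs as an emptyDir means this
	// MkdirAll is no longer needed at all.
	if err := os.MkdirAll("/gcs", 0o755); err != nil {
		fmt.Println("cannot create /gcs:", err)
	}
}
```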

Checklist:

@google-oss-robot

Hi @judahrand. Thanks for your PR.

I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@judahrand judahrand changed the title fix(sdk.v2): Remove need for base image root access to read/write artifacts fix(v2.components,sdk.v2): Remove need for base image root access to read/write artifacts Sep 8, 2021
@judahrand judahrand changed the title fix(v2.components,sdk.v2): Remove need for base image root access to read/write artifacts fix(v2,sdk.v2): Remove need for base image root access to read/write artifacts Sep 8, 2021
-_GCS_LOCAL_MOUNT_PREFIX = '/gcs/'
 _MINIO_LOCAL_MOUNT_PREFIX = '/minio/'
 _S3_LOCAL_MOUNT_PREFIX = '/s3/'
+_GCS_LOCAL_MOUNT_PREFIX = '/tmp/gcs/'
Member

This needs to be /gcs/ for GCS Fuse support on Vertex Pipelines.
The content you write to the local path /gcs/my_project/bucket/blob would be automatically synced to gs://my_project/bucket/blob.

Contributor Author

@judahrand judahrand Sep 9, 2021

Do you have an alternate solution for being able to execute in non-root images? Or is this something for which additional compatibility is needed elsewhere?

Looking at the gcsfuse docs, it looks like the mount path is arbitrary.

Member

I think it is an implementation decision by the Vertex Pipelines backend. Following that, we probably have multiple places in the code where we rely on this conversion between the local path and the gs:// URI.
Another example:

return "/gcs/" + strings.TrimPrefix(uri, "gs://"), nil

Contributor Author

@judahrand judahrand Sep 9, 2021

This PR does also change those.

But you're probably right that, in my case, the problem is closer to how Vertex uses gcsfuse and the file permissions that are set there.

However, don't you think that trying to create these directories at the root of the filesystem is an unnecessary hurdle in the way of non-root containers?

@judahrand
Contributor Author

I've opened a similar issue with Vertex Pipelines:
https://issuetracker.google.com/issues/199297769

@chensun
Member

chensun commented Sep 9, 2021

I've opened a similar issue with Vertex Pipelines:
https://issuetracker.google.com/issues/199297769

Thank you for opening the issue. I'll pass it along to the Vertex Pipelines team.

Meanwhile, I am curious: is there any specific reason you want to avoid running the container as the root user? Or is it an image out of your control?

@judahrand
Contributor Author

judahrand commented Sep 9, 2021

I've opened a similar issue with Vertex Pipelines:
https://issuetracker.google.com/issues/199297769

Thank you for opening the issue. I'll pass it along to the Vertex Pipelines team.

Meanwhile, I am curious: is there any specific reason you want to avoid running the container as the root user? Or is it an image out of your control?

It is an image which we also use in our GKE cluster, and our policy there is to use non-root containers wherever possible. It would be possible to create a different image for Vertex... But that seems unnecessary.

@chensun
Member

chensun commented Sep 9, 2021

It is an image which we also use in our GKE cluster, and our policy there is to use non-root containers wherever possible. It would be possible to create a different image for Vertex... But that seems unnecessary.

Got it.

Maybe something related:

I wonder if these changes might address the concerns behind your policy of avoiding root containers.

That said, I'll still pass your concern to the team, and see if we could make a change.

@judahrand
Contributor Author

Unfortunately, I don't set the rules 😉

I also suspect that we'll continue to see many K8s installations using the Docker shim for quite a while to come.

I appreciate that there is progress being made on the security side of containerization, which may make non-root less important, but it still strikes me as an odd failure mode (especially as it isn't documented).

Thank you very much for passing the issue along - I can't imagine a small file permission tweak is that big a change to enable non-root 😛

@Bobgy
Contributor

Bobgy commented Sep 10, 2021

FYI, there was a similar request raised for v2 compatible mode: #5673

@judahrand
Contributor Author

judahrand commented Sep 10, 2021

FYI, there was a similar request raised for v2 compatible mode: #5673

Ah! I missed this - might as well close this in that case.

Or, alternatively, I can use this PR to have a go at making your suggested changes, if you think they are likely to be mergeable? Though I might need some pointers on where to look to implement the changes (e.g. where is the container spec defined?).

@judahrand
Contributor Author

judahrand commented Sep 10, 2021

Here is where some of these changes are needed, I think:

func containerExecutorTemplate(container *pipelinespec.PipelineDeploymentConfig_PipelineContainerSpec, launcherImage string) *wfapi.Template {
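
Roughly the kind of change I have in mind there, as a sketch only (the package, function, and volume names are illustrative, assuming the Argo wfapi and corev1 types that file already uses):

```go
// Illustrative sketch, not the actual compiler code.
package argocompiler

import (
	wfapi "github.com/argoproj/argo-workflows/v3/pkg/apis/workflow/v1alpha1"
	corev1 "k8s.io/api/core/v1"
)

// addGCSScratch mounts an emptyDir at /gcs on the executor container so the
// directory already exists and is writable for non-root users.
func addGCSScratch(tmpl *wfapi.Template) {
	tmpl.Volumes = append(tmpl.Volumes, corev1.Volume{
		Name:         "gcs-scratch", // hypothetical volume name
		VolumeSource: corev1.VolumeSource{EmptyDir: &corev1.EmptyDirVolumeSource{}},
	})
	if tmpl.Container != nil {
		tmpl.Container.VolumeMounts = append(tmpl.Container.VolumeMounts, corev1.VolumeMount{
			Name:      "gcs-scratch",
			MountPath: "/gcs", // keeps the KFP v2 local-path contract intact
		})
	}
}
```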

@judahrand
Contributor Author

@Bobgy I've altered this PR to try to tackle #5673. I'd appreciate any feedback - I may well be going in very much the wrong direction?

@judahrand judahrand force-pushed the feat/non-root-artifact branch 2 times, most recently from fa0f00b to 217c3d8 on September 10, 2021 15:39
@judahrand judahrand marked this pull request as draft September 10, 2021 21:44
@Bobgy
Contributor

Bobgy commented Nov 2, 2021

/test kubeflow-pipelines-samples-v2

@judahrand
Contributor Author

Is there anything else here that needs to be done to make this mergeable?

@capri-xiyue
Contributor

/lgtm
/approve

@google-oss-prow google-oss-prow bot added the lgtm label Nov 8, 2021
@judahrand
Contributor Author

/assign @chensun

@Bobgy
Contributor

Bobgy commented Nov 15, 2021

Thank you @judahrand!
/lgtm

@chensun
Member

chensun commented Nov 15, 2021

/approve

Thanks!

@google-oss-prow

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: capri-xiyue, chensun

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@chensun chensun added the cherrypick-approved label (area OWNER approves to cherry pick this PR to current active release branch) Nov 15, 2021
@judahrand
Contributor Author

/retest

@capri-xiyue capri-xiyue removed their request for review December 24, 2021 00:04
@capri-xiyue capri-xiyue removed their assignment Dec 24, 2021
@jie8357IOII

jie8357IOII commented Feb 9, 2022

Are there any updates?

@Bobgy
Contributor

Bobgy commented Feb 10, 2022

/assign @chensun @zijianjoy

@Bobgy Bobgy removed their assignment Feb 10, 2022
@dandawg

dandawg commented Apr 14, 2022

This looks to be close to the finish line. What work is still needed to get it across, and do we have a timeline?

@zijianjoy
Collaborator

Hello @judahrand, would you like to resolve the merge conflict and run the tests again? This PR has been approved, so we don't expect many changes to be required beyond that.

@google-oss-prow

@judahrand: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| kubeflow-pipelines-v2-go-test | 729c7ce | link | true | /test kubeflow-pipelines-v2-go-test |
| kubeflow-pipelines-sdk-isort | 729c7ce | link | unknown | /test kubeflow-pipelines-sdk-isort |
| kubeflow-pipelines-sdk-yapf | 729c7ce | link | unknown | /test kubeflow-pipelines-sdk-yapf |
| kubeflow-pipelines-sdk-python310 | 729c7ce | link | unknown | /test kubeflow-pipelines-sdk-python310 |
| kubeflow-pipelines-sdk-python38 | 729c7ce | link | unknown | /test kubeflow-pipelines-sdk-python38 |
| kubeflow-pipelines-sdk-python311 | 729c7ce | link | unknown | /test kubeflow-pipelines-sdk-python311 |
| test-upgrade-kfp-sdk | 729c7ce | link | unknown | /test test-upgrade-kfp-sdk |
| kubeflow-pipelines-component-yaml | 729c7ce | link | unknown | /test kubeflow-pipelines-component-yaml |
| test-run-all-gcpc-modules | 729c7ce | link | unknown | /test test-run-all-gcpc-modules |
| kubeflow-pipelines-sdk-python39 | 729c7ce | link | unknown | /test kubeflow-pipelines-sdk-python39 |
| kubeflow-pipelines-sdk-docformatter | 729c7ce | link | unknown | /test kubeflow-pipelines-sdk-docformatter |
| kubeflow-pipelines-sdk-python37 | 729c7ce | link | unknown | /test kubeflow-pipelines-sdk-python37 |

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@rimolive
Member

/close

@google-oss-prow google-oss-prow bot closed this Apr 24, 2024

@rimolive: Closed this PR.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
