fix(backend): OutPutPath dir creation mode Fixes #7629 #9946

Merged on Sep 22, 2023 (1 commit)

Conversation

rvadim
Contributor

@rvadim rvadim commented Aug 29, 2023

Description of your changes:
Attempts to fix #7629.

Output of ls -la after kfp-launcher created the directory:

total 288
drwxrwxrwt 1 root root   4096 Aug 29 20:47 .
drwxr-xr-x 1 root root   4096 Aug 29 20:47 ..
drw-r--r-- 2 user user   4096 Aug 29 20:47 kfp <------ 0644 mode
-rw------- 1 root root 278952 Aug 16 20:34 tmp7xutvpq1
[KFP Executor 2023-08-29 20:47:56,221 INFO]: Looking for component `square` in --component_module_path `/tmp/tmp.Y2OeYEVRGv/ephemeral_component.py`
[KFP Executor 2023-08-29 20:47:56,221 INFO]: Loading KFP component "square" from /tmp/tmp.Y2OeYEVRGv/ephemeral_component.py (directory "/tmp/tmp.Y2OeYEVRGv" and module name "ephemeral_component")
[KFP Executor 2023-08-29 20:47:56,222 INFO]: Got executor_input:
{
    "inputs": {
        "parameterValues": {
            "x": 2
        }
    },
    "outputs": {
        "parameters": {
            "Output": {
                "outputFile": "/tmp/kfp/Output"
            }
        },
        "outputFile": "/tmp/kfp/output_metadata.json"
    }
}
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/local/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/user/.local/lib/python3.7/site-packages/kfp/components/executor_main.py", line 105, in <module>
    executor_main()
  File "/home/user/.local/lib/python3.7/site-packages/kfp/components/executor_main.py", line 101, in executor_main
    executor.execute()
  File "/home/user/.local/lib/python3.7/site-packages/kfp/components/executor.py", line 347, in execute
    self._write_executor_output(result)
  File "/home/user/.local/lib/python3.7/site-packages/kfp/components/executor.py", line 297, in _write_executor_output
    with open(executor_output_path, 'w') as f:
PermissionError: [Errno 13] Permission denied: '/tmp/kfp/output_metadata.json'

This may not be a complete fix; I would appreciate advice.
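The `drw-r--r--` entry for kfp in the listing above is the relevant detail: mode 0644 gives a directory read and write bits but no execute (search) bit. A minimal Go sketch of that check follows. `ownerCanTraverse` is a hypothetical helper written for illustration and is not part of kfp-launcher.

```go
package main

import (
	"fmt"
	"os"
)

// ownerCanTraverse reports whether the owner execute (search) bit is set
// in perm. Hypothetical helper for illustration only.
func ownerCanTraverse(perm os.FileMode) bool {
	return perm.Perm()&0100 != 0
}

func main() {
	for _, perm := range []os.FileMode{0644, 0755} {
		// os.ModeDir makes the mode print with a leading "d", as ls does.
		fmt.Printf("%v owner can traverse: %v\n", os.ModeDir|perm, ownerCanTraverse(perm))
	}
}
```

Without the search bit, the owner cannot open or create entries inside the directory, which matches the PermissionError on /tmp/kfp/output_metadata.json in the traceback.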

@google-oss-prow

Hi @rvadim. Thanks for your PR.

I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@google-cla

google-cla bot commented Aug 29, 2023

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@zijianjoy
Collaborator

/assign @chensun

@rvadim
Contributor Author

rvadim commented Sep 1, 2023

I have built it and tested it with a mutating webhook that changes the container image; it works fine with the fix.

@chensun
Member

chensun commented Sep 1, 2023

> I have built it and tested it with a mutating webhook that changes the container image; it works fine with the fix.

I'm still puzzled about why the execute bits would matter for reading and writing files. Is there anything special about the base image you're using? This error doesn't reproduce on all pipeline runs, and I'm curious what the trigger point is.

@chensun
Member

chensun commented Sep 1, 2023

/ok-to-test

@rvadim
Contributor Author

rvadim commented Sep 4, 2023

I tested with the code below (based on https://pkg.go.dev/os#MkdirAll):

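package main

import (
	"fmt"
	"log"
	"os"
)

func main() {
	// 0644 has no execute (search) bit, so MkdirAll creates /tmp/test and
	// then cannot create subdir inside it (unless run as root).
	err := os.MkdirAll("/tmp/test/subdir", 0644)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("/tmp/test/subdir created")

	// Not reached as a non-root user: the MkdirAll call above fails first.
	err = os.WriteFile("/tmp/test/subdir/testfile.txt", []byte("Hello, Gophers!"), 0644)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("/tmp/test/subdir/testfile.txt created")
}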

To me, it looks like MkdirAll recursively tries to create /tmp/test/subdir. The first step succeeds and /tmp/test gets 0644 permissions. Next it calls syscall.Mkdir(longName, syscallMode(perm)) with /tmp/test/subdir as longName, and at that point the value of perm no longer matters, because the syscall fails with EACCES (13).

$ rm -R /tmp/test/
$ ls -al /tmp/test
ls: cannot access '/tmp/test': No such file or directory
$ go run test.go 
2023/09/04 10:47:29 mkdir /tmp/test/subdir: permission denied
exit status 1
$ ls -la /tmp/test/
ls: cannot access '/tmp/test/.': Permission denied
ls: cannot access '/tmp/test/..': Permission denied
total 0
d????????? ? ? ? ?            ? .
d????????? ? ? ? ?            ? ..
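The recursion described above can be sketched as follows. This is a simplified illustration, not the standard library implementation, which handles more edge cases. `mkdirAllSketch` and `demoAttempts` are hypothetical names. The demo uses 0755 under a temporary directory so that it succeeds for any user and only shows the order in which directories are created.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// mkdirAllSketch creates path and any missing parents, applying the same
// perm at every level. Each directory it attempts to create is appended
// to *attempts. Simplified illustration only.
func mkdirAllSketch(path string, perm os.FileMode, attempts *[]string) error {
	// Nothing to do if the path already exists as a directory.
	if fi, err := os.Stat(path); err == nil {
		if fi.IsDir() {
			return nil
		}
		return fmt.Errorf("%s exists and is not a directory", path)
	}
	// Create the parent first.
	if parent := filepath.Dir(path); parent != path {
		if err := mkdirAllSketch(parent, perm, attempts); err != nil {
			return err
		}
	}
	*attempts = append(*attempts, path)
	// If the parent was just created without the execute bit, this call
	// fails with EACCES for a non-root user.
	return os.Mkdir(path, perm)
}

// demoAttempts runs the sketch under a temp dir and returns the attempted
// paths relative to that dir, in creation order.
func demoAttempts() ([]string, error) {
	base, err := os.MkdirTemp("", "mkdirall-sketch")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(base)

	var attempts []string
	target := filepath.Join(base, "test", "subdir")
	if err := mkdirAllSketch(target, 0755, &attempts); err != nil {
		return nil, err
	}
	rel := make([]string, 0, len(attempts))
	for _, p := range attempts {
		r, err := filepath.Rel(base, p)
		if err != nil {
			return nil, err
		}
		rel = append(rel, filepath.ToSlash(r))
	}
	return rel, nil
}

func main() {
	fmt.Println(demoAttempts())
}
```

Because the parent is created before the child and with the same perm, a perm without the execute bit makes the second step fail for a non-root user, which matches the "mkdir /tmp/test/subdir: permission denied" line in the transcript.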

MkdirAll behaves differently from the mkdir CLI, because mkdir -p creates intermediate directories with a default mode (775 here) and applies -m only to the final directory:

$ rm -R /tmp/test/
$ mkdir -p -m 0644 /tmp/test/subdir
$ ls -al /tmp/test/
total 72
drwxrwxr-x  3 vadim vadim  4096 Sep  4 11:04 . <- 775
drwxrwxrwt 34 root  root  61440 Sep  4 11:04 ..
drw-r--r--  2 vadim vadim  4096 Sep  4 11:04 subdir <- 644
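For comparison, the CLI behavior shown above can be approximated in Go by creating the intermediate directories with a traversable mode and applying the requested mode only to the last component. This is a sketch under assumptions: `mkdirLikeCLI` and `demoModes` are hypothetical helpers, and 0755 for the parents is a stand-in, since the real mkdir derives the intermediate mode from the umask.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// mkdirLikeCLI mimics `mkdir -p -m MODE path`: parents get a traversable
// mode (0755 is an assumption here) and only the final directory gets
// finalPerm.
func mkdirLikeCLI(path string, finalPerm os.FileMode) error {
	if err := os.MkdirAll(filepath.Dir(path), 0755); err != nil {
		return err
	}
	if err := os.Mkdir(path, finalPerm); err != nil {
		return err
	}
	// Chmod so the final mode does not depend on the umask, as with -m.
	return os.Chmod(path, finalPerm)
}

// demoModes creates <tmp>/test/subdir with finalPerm 0644 and returns the
// permission bits of test and subdir.
func demoModes() (os.FileMode, os.FileMode, error) {
	base, err := os.MkdirTemp("", "mkdir-cli")
	if err != nil {
		return 0, 0, err
	}
	defer os.RemoveAll(base)

	target := filepath.Join(base, "test", "subdir")
	if err := mkdirLikeCLI(target, 0644); err != nil {
		return 0, 0, err
	}
	parentInfo, err := os.Stat(filepath.Dir(target))
	if err != nil {
		return 0, 0, err
	}
	leafInfo, err := os.Stat(target)
	if err != nil {
		return 0, 0, err
	}
	return parentInfo.Mode().Perm(), leafInfo.Mode().Perm(), nil
}

func main() {
	parent, leaf, err := demoModes()
	fmt.Println(parent, leaf, err)
}
```

With this split, the restrictive mode never lands on a directory that still needs to be traversed during creation.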

However, you can emulate the MkdirAll behavior like this:

$ mkdir -p -m0644 /tmp/test/
$ ls -la /tmp/test/
ls: cannot access '/tmp/test/.': Permission denied
ls: cannot access '/tmp/test/..': Permission denied
total 0
d????????? ? ? ? ?            ? .
d????????? ? ? ? ?            ? ..
$ mkdir -p -m0644 /tmp/test/subdir
mkdir: cannot create directory ‘/tmp/test’: Permission denied
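A directory left in this state can still be repaired by its owner, because chmod only needs search permission on the parent directory. The sketch below reproduces the 0644 directory under a temporary location, restores the execute bit, and then writes a file. `repairAndWrite` is a hypothetical helper, and 0755 is an assumed target mode.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// repairAndWrite creates a directory with mode 0644, restores the execute
// bit with Chmod, and then writes a file into it. Illustration only.
func repairAndWrite() error {
	base, err := os.MkdirTemp("", "mode-repair")
	if err != nil {
		return err
	}
	defer os.RemoveAll(base)

	dir := filepath.Join(base, "test")
	// Same state as after `mkdir -p -m0644 /tmp/test/`.
	if err := os.Mkdir(dir, 0644); err != nil {
		return err
	}
	// Chmod resolves the path through the parent, which is searchable,
	// so the owner can still add the execute bit back.
	if err := os.Chmod(dir, 0755); err != nil {
		return err
	}
	return os.WriteFile(filepath.Join(dir, "testfile.txt"), []byte("Hello, Gophers!"), 0644)
}

func main() {
	if err := repairAndWrite(); err != nil {
		fmt.Println("failed:", err)
		return
	}
	fmt.Println("write succeeded after chmod")
}
```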

The execute bit (x) allows the affected user to enter the directory and access the files and directories inside it.

So regardless of whether MkdirAll can create the subdirectory, a user will be unable to write files into a directory that lacks the execute bit.

Of course, it works as root, because root bypasses the mode check.

$ sudo ls -la /tmp/test/
total 68
drw-r-----  2 vadim vadim  4096 Sep  4 10:47 .
drwxrwxrwt 37 root  root  61440 Sep  4 10:48 ..
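Putting the findings above together, a launcher-style helper that must work for non-root users would create the output directory with the execute bit set before writing. The following is a sketch under assumptions: `writeOutput` and `demoWrite` are hypothetical names, and the 0755 mode is an assumed value rather than something taken from the kfp-launcher source.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeOutput creates the parent directory of outputFile with mode 0755
// (read, write, and search for the owner) and then writes data to the
// file. The name and the mode are assumptions for illustration.
func writeOutput(outputFile string, data []byte) error {
	if err := os.MkdirAll(filepath.Dir(outputFile), 0755); err != nil {
		return fmt.Errorf("create output dir: %w", err)
	}
	return os.WriteFile(outputFile, data, 0644)
}

// demoWrite exercises writeOutput under a throwaway temp directory and
// returns the content read back from the file.
func demoWrite() (string, error) {
	base, err := os.MkdirTemp("", "kfp-output")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(base)

	out := filepath.Join(base, "kfp", "Output")
	if err := writeOutput(out, []byte("4")); err != nil {
		return "", err
	}
	b, err := os.ReadFile(out)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	got, err := demoWrite()
	fmt.Println(got, err)
}
```

The file itself keeps mode 0644; only the directory needs the execute bit.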

@rvadim
Contributor Author

rvadim commented Sep 4, 2023

/retest

@rvadim
Contributor Author

rvadim commented Sep 4, 2023

Hi! Also, I have no idea how my changes could trigger these dependency problems:

git archive --format=tar "${stash:-HEAD}" | gzip >backend/src/v2/test/tmp/context.tar.gz
python -u scripts/upload_gcs_blob.py tmp/context.tar.gz gs://kfp-ci/8c6e64e8df17f6a0b93571a07c6f3133a5da6aa2/v2-sample-test/src/context.tar.gz
File tmp/context.tar.gz uploaded to destination gs://kfp-ci/8c6e64e8df17f6a0b93571a07c6f3133a5da6aa2/v2-sample-test/src/context.tar.gz
export KF_PIPELINES_ENDPOINT=https://4e18c21c9d33d20f-dot-datalab-vm-staging.googleusercontent.com/ \
	&& python -u sample_test.py \
	--samples_config samples/test/config.yaml \
	--context gs://kfp-ci/8c6e64e8df17f6a0b93571a07c6f3133a5da6aa2/v2-sample-test/src/context.tar.gz \
	--gcs_root gs://kfp-ci/8c6e64e8df17f6a0b93571a07c6f3133a5da6aa2/v2-sample-test/data \
	--gcr_root gcr.io/kfp-ci/8c6e64e8df17f6a0b93571a07c6f3133a5da6aa2/v2-sample-test \
	--kfp_package_path "'git+https://github.com/kubeflow/pipelines@refs/pull/9946/merge#egg=kfp&subdirectory=sdk/python'"
Traceback (most recent call last):
  File "sample_test.py", line 21, in <module>
    import kfp.deprecated as kfp
  File "/usr/local/lib/python3.7/site-packages/kfp/deprecated/__init__.py", line 19, in <module>
    from . import dsl
  File "/usr/local/lib/python3.7/site-packages/kfp/deprecated/dsl/__init__.py", line 16, in <module>
    from ._pipeline import Pipeline, PipelineExecutionMode, pipeline, get_pipeline_conf, PipelineConf
  File "/usr/local/lib/python3.7/site-packages/kfp/deprecated/dsl/_pipeline.py", line 21, in <module>
    from kfp.deprecated.dsl import _component_bridge
  File "/usr/local/lib/python3.7/site-packages/kfp/deprecated/dsl/_component_bridge.py", line 29, in <module>
    from kfp.deprecated.dsl import component_spec as dsl_component_spec
  File "/usr/local/lib/python3.7/site-packages/kfp/deprecated/dsl/component_spec.py", line 21, in <module>
    from kfp.deprecated.dsl import type_utils
  File "/usr/local/lib/python3.7/site-packages/kfp/deprecated/dsl/type_utils.py", line 19, in <module>
    from kfp.components.types import type_utils
ModuleNotFoundError: No module named 'kfp.components.types'
make: *** [Makefile:18: sample-test] Error 1

Also, as far as I can see, all of the latest commits on master fail this test.

@khituras

khituras commented Sep 8, 2023

> I have built it and tested it with a mutating webhook that changes the container image; it works fine with the fix.

> I'm still puzzled about why the execute bits would matter for reading and writing files. Is there anything special about the base image you're using? This error doesn't reproduce on all pipeline runs, and I'm curious what the trigger point is.

The execute bit on a directory is required to enter the directory for any reason, including the creation of new files. So it is not surprising that one gets the Permission Denied error. I find it more curious why this ever worked, if it did.

We have this problem with our Container Components. This is currently a blocker for us; it would be great if this could be fixed.

@khituras

khituras commented Sep 8, 2023

> Hi! Also, I have no idea how my changes could trigger these dependency problems:

I think there is something broken in kfp.deprecated>=2.1.2. I get the same error with kfp==2.1.3 when I try import kfp.deprecated as kfp.
This is used in sample_test.py hence the error. Unfortunately, I do not know the best intended way to fix that. I guess I would try to migrate sample_test.py.

@chensun
Member

chensun commented Sep 22, 2023

Sample test doesn't block merge at this moment. Upgrading sample test infra is tracked by #9487

Member

@chensun chensun left a comment

/lgtm
/approve

Thanks!

@google-oss-prow google-oss-prow bot added the lgtm label Sep 22, 2023
@google-oss-prow

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: chensun

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@google-oss-prow

google-oss-prow bot commented Sep 22, 2023

@rvadim: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name                      Commit   Details  Required  Rerun command
kubeflow-pipelines-samples-v2  f6f0a9d  link     false     /test kubeflow-pipelines-samples-v2

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@chensun
Member

chensun commented Sep 22, 2023

/test kubeflow-pipeline-backend-test

@google-oss-prow google-oss-prow bot merged commit 4003e56 into kubeflow:master Sep 22, 2023
5 checks passed

Successfully merging this pull request may close these issues.

[backend] OutputPath is giving "permission denied" - why?