
volume create sample fails with "{{inputs.parameters.create-pvc-name}}" not found #1327

Closed
ryandawsonuk opened this issue May 14, 2019 · 8 comments



ryandawsonuk commented May 14, 2019

I generated the .tar.gz for the volume samples and uploaded the DAG one to the pipelines UI. No required parameters are detected, but when I enable permissions, compile the volumeop_dag.py sample, and submit it, I get This step is in Pending state with this message: Unschedulable: persistentvolumeclaim "{{inputs.parameters.create-pvc-name}}" not found. Running kubectl get pvc shows that a PVC does get created and is available to use, but the first pod that tries to use the volume claim cannot be scheduled. It looks like the pod is looking for a volume claim literally named "{{inputs.parameters.create-pvc-name}}" rather than the name of the volume that was created in the previous step.

The parameter doesn't seem to be a parameter for the user to set. The pipelines UI doesn't detect any, and when I submit the YAML to Argo with -p create-pvc-name="volume-op-dag-pvc1" I just get the same error, and the PVC that is created doesn't get the name specified in the parameter.

Not sure whether these examples are to be reviewed under #566

I compiled using the 0.1.18 SDK and also with 0.1.19. I also tried volumeop_sequential.py; same failure behaviour. A PVC is created called volumeop-sequential-gh7d4-newpvc rather than just newpvc.
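For reference, a minimal sketch of what the pending pod ends up asking for when the parameter is never substituted. Only the claim-name string comes from the error above; the surrounding pod-spec fields and the volume name are assumed, not taken from the actual compiled workflow:

```yaml
# The pod references the literal, unresolved template string as the
# claim name, so the scheduler cannot find any matching PVC.
volumes:
  - name: create-pvc                      # assumed volume name
    persistentVolumeClaim:
      claimName: '{{inputs.parameters.create-pvc-name}}'  # never resolved
```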

@ryandawsonuk ryandawsonuk changed the title volume samples fail with persistentvolumeclaim "{{inputs.parameters.create-pvc-name}}" not found volume create sample fails with "{{inputs.parameters.create-pvc-name}}" not found May 14, 2019
@ryandawsonuk ryandawsonuk reopened this May 14, 2019
@ryandawsonuk (Contributor Author)

I did think this might be a duplicate of #1303, but I tried upgrading to Argo v2.3.0-rc3 (which, judging by the commit times, should contain the relevant patch) and I still see the same problem.


elikatsis commented May 14, 2019

Hi @ryandawsonuk! Thank you for trying out this feature.

As you found out, unfortunately Kubeflow runs with an old version of Argo, but we are in the process of updating it.

[...] I tried upgrading to Argo v2.3.0-rc3 (which judging by the commit times should contain the relevant patch) and I still see the same problem.

This is weird. Let me guide you through the update process, and what I have been doing, to make sure it is performed properly.

I have been using the latest releases, argoproj/workflow-controller:v2.3.0-rc3 and argoproj/argoexec:v2.3.0-rc3.
This version provokes some pipeline bugs, though, so the update should go along with a slightly modified api-server.

  • For Argo, change the deployment's image and the configmap's executorImage respectively to the ones mentioned above, using the following commands:

    kubectl -n kubeflow edit deployment workflow-controller
    kubectl -n kubeflow edit configmap workflow-controller-configmap
    
  • For the api-server (the ml-pipeline deployment), you could either use a custom image (I have been using elikatsis/ml-pipeline-api-server:0.1.18-pick-1289, which is based on 0.1.18 with #1289 (Backend - Marking auto-added artifacts as optional) cherry-picked), or build and use an api-server image from master (which has the aforementioned PR merged, though I haven't tried the master version). Once you have decided on the image, make the change using the command:

    kubectl -n kubeflow edit deployment ml-pipeline
    

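To make the two Argo edits concrete, here is a sketch of the relevant fragments after editing. The field paths are assumed from standard Kubernetes Deployment manifests and the Argo v2.3-era workflow-controller ConfigMap layout; only the image tags come from this comment:

```yaml
# workflow-controller deployment: point the controller container at the rc
spec:
  template:
    spec:
      containers:
        - name: workflow-controller
          image: argoproj/workflow-controller:v2.3.0-rc3
---
# workflow-controller-configmap: point the executor image at the rc
data:
  config: |
    executorImage: argoproj/argoexec:v2.3.0-rc3
```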
Would you mind trying these and reporting back on your results?

Edit: The custom image I refer to is now available at gcr.io/arrikto/ml-pipeline-api-server:0.1.18-pick-1289; it'd be better to use this one.


elikatsis commented May 14, 2019

[...] A pvc is created called volumeop-sequential-gh7d4-newpvc rather than just newpvc

I missed this earlier. This is the desired behavior: we prepend the resource_name with the workflow name so that multiple runs can be created without their PVC names colliding.
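As a hypothetical illustration (this is not KFP's actual code; the function name is invented), the prefixing rule amounts to:

```python
# Hypothetical helper, NOT KFP source code: shows how a resource_name
# such as "newpvc" ends up prefixed with the generated workflow name
# (e.g. "volumeop-sequential-gh7d4" from the report above).
def prefixed_pvc_name(workflow_name: str, resource_name: str) -> str:
    # The compiled manifest names the PVC "{{workflow.name}}-<resource_name>";
    # Argo substitutes {{workflow.name}} at runtime, yielding this result.
    return f"{workflow_name}-{resource_name}"

print(prefixed_pvc_name("volumeop-sequential-gh7d4", "newpvc"))
# → volumeop-sequential-gh7d4-newpvc
```

Because the workflow name is unique per run, two runs of the same pipeline never fight over a single PVC name.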

@elikatsis (Member)

/assign @elikatsis

@ryandawsonuk (Contributor Author)

That works, thanks!

[screenshot of the successful run]

I just hadn't patched all of the relevant places.

@jinchihe (Member)

@elikatsis The Argo and ml-pipeline images mentioned above have been updated and will be in the new Kubeflow release, right? Thanks.

@elikatsis (Member)

@jinchihe, I'm actually running some tests and will be creating a PR to test the latest releases E2E.
I'm confident we can have it updated for the next Kubeflow release. Let's see how it goes!

@mmuppidi

@jinchihe, I'm actually running some tests and will be creating a PR to test the latest releases E2E.
I'm confident we can have it updated for the next Kubeflow release. Let's see how it goes!

@elikatsis The current Kubeflow release (v0.6.0-rc.0) doesn't include it. Any guess on when this will be part of an official release?
