bootstrapper error trying to create the PyTorch Operator #832

Closed
jlewi opened this issue May 19, 2018 · 5 comments

@jlewi
Contributor

jlewi commented May 19, 2018

I created a newly built image of the bootstrapper

gcr.io/kubeflow-images-public/bootstrapper:v20180519-v0.1.1-57-g4c29f52f-e3b0c4

When I ran the bootstrapper (as part of #823), I got the following error:

jlewi@jlewi-carbon-glaptop:~/git_kubeflow/bootstrap$ kubectl -n kubeflow-admin logs kubeflow-bootstrapper-0
{"filename":"app/server.go:191","level":"info","msg":"Using existing namespace: kubeflow","time":"2018-05-19T20:41:56Z"}
{"filename":"app/server.go:273","level":"info","msg":"Cluster version: v1.9.6-gke.1","time":"2018-05-19T20:41:56Z"}
{"filename":"app/server.go:165","level":"info","msg":"Storage class: standard","time":"2018-05-19T20:41:56Z"}
{"filename":"app/server.go:179","level":"info","msg":"StorageClass standard is default true","time":"2018-05-19T20:41:56Z"}
{"filename":"app/server.go:288","level":"info","msg":"Using K8s host https://10.55.240.1:443","time":"2018-05-19T20:41:56Z"}
{"filename":"app/server.go:316","level":"info","msg":"Directory /opt/bootstrap/default exists","time":"2018-05-19T20:41:56Z"}
{"filename":"app/server.go:345","level":"info","msg":"App already has registry kubeflow","time":"2018-05-19T20:41:56Z"}
{"filename":"app/server.go:368","level":"info","msg":"Installing package kubeflow/core@v0.1.0-rc.4","time":"2018-05-19T20:41:56Z"}
{"filename":"app/server.go:373","level":"info","msg":"Package core already exists","time":"2018-05-19T20:41:56Z"}
{"filename":"app/server.go:368","level":"info","msg":"Installing package kubeflow/tf-serving@v0.1.0-rc.4","time":"2018-05-19T20:41:56Z"}
{"filename":"app/server.go:373","level":"info","msg":"Package tf-serving already exists","time":"2018-05-19T20:41:56Z"}
{"filename":"app/server.go:368","level":"info","msg":"Installing package kubeflow/tf-job@v0.1.0-rc.4","time":"2018-05-19T20:41:56Z"}
{"filename":"app/server.go:373","level":"info","msg":"Package tf-job already exists","time":"2018-05-19T20:41:56Z"}
{"filename":"app/server.go:220","level":"info","msg":"Component kubeflow-core already exists","time":"2018-05-19T20:41:56Z"}
{"filename":"app/server.go:211","level":"info","msg":"Creating Component: pytorch-operator ...","time":"2018-05-19T20:41:56Z"}
{"filename":"app/server.go:217","level":"fatal","msg":"There was a problem creating protoype package kubeflow-core; error no prototype names matched 'pytorch-operator'","time":"2018-05-19T20:41:56Z"}

This looks like a versioning issue; I'm guessing the bootstrapper is using an older version of the Kubeflow registry baked into the image that doesn't include the PyTorch operator.

I think #829 might provide a workaround: it would let us specify via a config map which components to install, so we could skip the PyTorch operator.
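
For anyone triaging similar bootstrapper output, the fatal entry in the log above can be isolated with a small filter. This is only a sketch: it assumes each log line is a JSON object with "level" and "msg" fields, as in the output pasted here, and the helper name `fatal_messages` is made up for illustration; it is not part of the bootstrapper.

```python
import json


def fatal_messages(log_text):
    """Return the "msg" field of every JSON log line whose level is "fatal".

    Hypothetical helper; assumes one JSON object per line, as in the log above.
    """
    messages = []
    for line in log_text.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip shell prompts and other non-JSON lines
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or malformed entries
        if entry.get("level") == "fatal":
            messages.append(entry.get("msg", ""))
    return messages


# Two entries copied from the log in this issue, plus the shell prompt line.
sample = """\
jlewi@jlewi-carbon-glaptop:~/git_kubeflow/bootstrap$ kubectl -n kubeflow-admin logs kubeflow-bootstrapper-0
{"filename":"app/server.go:211","level":"info","msg":"Creating Component: pytorch-operator ...","time":"2018-05-19T20:41:56Z"}
{"filename":"app/server.go:217","level":"fatal","msg":"There was a problem creating protoype package kubeflow-core; error no prototype names matched 'pytorch-operator'","time":"2018-05-19T20:41:56Z"}
"""

for msg in fatal_messages(sample):
    print(msg)
```

In practice the text passed in would be whatever `kubectl -n kubeflow-admin logs kubeflow-bootstrapper-0` printed, saved to a file or captured from a pipe.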

/priority p1
/assign @kunmi

@k8s-ci-robot
Contributor

@jlewi: GitHub didn't allow me to assign the following users: kunmi.

Note that only kubeflow members and repo collaborators can be assigned.

@jlewi
Contributor Author

jlewi commented May 19, 2018

/assign @kunmingg

@kunmingg
Contributor

Hmm, I remember we have PyTorch set up for the bootstrapper e2e test...
Let me take a look.

@kunmingg
Contributor

The bootstrapper now uses the master branch for the registry, which contains the PyTorch operator. Closing this now.

@kunmingg
Contributor

/close

yanniszark pushed a commit to arrikto/kubeflow that referenced this issue Feb 15, 2021