Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend Katib API for NAS jobs #327

Merged
merged 45 commits into from
Jan 29, 2019

Conversation

andreyvelich
Copy link
Member

@andreyvelich andreyvelich commented Jan 17, 2019

Fixes: #293
Related to: #257


This change is Reviewable

@andreyvelich
Copy link
Member Author

Is this PR:

  1. I extend API for structure of NAS yaml file.
  2. Make blank NAS Reinforcement Learning service suggestion for future improvement
  3. Extend manager with JobType functionality. We have to create our own tables for NAS config and NAS suggestion. We have to divide HP jobs and NAS jobs in the manager.
  4. I am using the same StudyConfig to parse information from StudyJob. I extend StudyConfig structure and add new functionality inside getStudyConf function in katib_api_util.
  5. Add new type in ParameterType
    Please, review it.
    @YujiOshima @hougangliu @jlewi @johnugeorge @xyhuang @ddutta

cmd/manager/main.go Outdated Show resolved Hide resolved
cmd/suggestion/nasrl/Dockerfile Show resolved Hide resolved
cmd/suggestion/nasrl/main.py Show resolved Hide resolved
examples/nasjob-example-RL.yaml Outdated Show resolved Hide resolved
pkg/suggestion/nasrl_service.py Outdated Show resolved Hide resolved
@gaocegege
Copy link
Member

gaocegege commented Jan 17, 2019

Awesome!

Is it ready to review?

@andreyvelich
Copy link
Member Author

@gaocegege Yes, you can review it, please.

cmd/manager/main.go Outdated Show resolved Hide resolved
cmd/manager/main.go Outdated Show resolved Hide resolved
@Akado2009
Copy link
Contributor

Akado2009 commented Jan 17, 2019

Everyone and @gaocegege in particular, sorry for the inconvenience, but this PR should be labeled as WIP as it heavily depends: #328, which actually introduces additional APIs to delete NAS job/create them. So we insist this PR should be merged before this one: #328

@andreyvelich
Copy link
Member Author

andreyvelich commented Jan 24, 2019

I made some changes inside metrics-collector and added function "GetStudyJobType" inside protobuf, please check it @hougangliu @YujiOshima @gaocegege
https://github.com/kubeflow/katib/pull/327/files#diff-82bfd7279d5e08ab134ad69029e09925R73

@andreyvelich
Copy link
Member Author

As @YujiOshima said, right now we remove function GetStudyJobType from API and spit between NAS and HP config inside function GetStudy in manager. You can find it here: #328

@andreyvelich
Copy link
Member Author

I think we finished Katib API changes.
I will remove WIP status and I am ready for the final review before merge this PR.
@YujiOshima @hougangliu @gaocegege @jlewi @johnugeorge

@andreyvelich andreyvelich changed the title [WIP] Extend Katib API for NAS jobs Extend Katib API for NAS jobs Jan 26, 2019
@hougangliu
Copy link
Member

/lgtm

kind: Job
metadata:
name: {{.WorkerID}}
namespace: kubeflow
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/kubeflow/{{.NameSpace}}

@YujiOshima
Copy link
Contributor

@andreyvelich Thank you!! LGTM except for @hougangliu 's comment.

@k8s-ci-robot k8s-ci-robot removed the lgtm label Jan 28, 2019
@andreyvelich
Copy link
Member Author

@hougangliu @YujiOshima
Thanks! I changed it.

@YujiOshima
Copy link
Contributor

/lgtm
/approve

@k8s-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: YujiOshima

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit f11c13e into kubeflow:master Jan 29, 2019
@andreyvelich andreyvelich deleted the 293-extend-sj-structure branch October 6, 2021 00:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants