Build Pipeline leveraging Arena #1058

cheyang · 2019-03-28T07:46:20Z

The sample pipeline uses Arena to prepare the data and source code, then train and export a TensorFlow model for MNIST handwriting recognition. It provides arena_launcher and an API that make it easy to use pipelines for specific kinds of training jobs, such as MPI jobs and TensorFlow Estimator jobs.

Ark-kun · 2019-03-28T21:27:31Z

components/arena/python/arena/_arena_dist_tf.py

+
+# def DistributeTFOp(name, image, gpus: int, ):
+
+class DistributeTFOp(dsl.ContainerOp):


def arena_distribute_tf_op(....):
    return MPIOp(.... )

Why do some samples use RandomNumOp directly? I'm wondering what the guiding principle is. Thanks for your advice. https://github.com/kubeflow/pipelines/blob/master/samples/basic/condition.py#L20

with dsl.Condition(flip.output == 'tails'):
    random_num_tail = RandomNumOp(10, 19)
    with dsl.Condition(random_num_tail.output > 15):

Sorry for the confusion. Those samples are the oldest and have not been updated for a while.
Other samples are a bit more modern: https://github.com/kubeflow/pipelines/blob/master/samples/kubeflow-tf/kubeflow-training-classification.py

Your code (inheriting from ContainerOp vs. returning ContainerOp) is not wrong. It's just a bit outdated.

Thank you, I will update with your suggestions.
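
The two styles discussed in this thread can be sketched roughly as follows. This is an illustration only, not the code from this PR: `ContainerOp` below is a small stand-in class rather than the real `kfp.dsl.ContainerOp`, and the image name and the `--name`/`--gpus` arguments are placeholders assumed for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ContainerOp:
    """Stand-in for kfp.dsl.ContainerOp (assumption, illustration only)."""
    name: str
    image: str
    arguments: List[str] = field(default_factory=list)


# Older style: subclass the op and assemble the arguments in __init__.
class MPIOp(ContainerOp):
    def __init__(self, name: str, image: str, gpus: int = 0):
        super().__init__(name=name, image=image,
                         arguments=["--name", name, "--gpus", str(gpus)])


# Newer style: a plain factory function that returns a ContainerOp.
def arena_launch_mpi_op(name: str, image: str, gpus: int = 0) -> ContainerOp:
    return ContainerOp(name=name, image=image,
                       arguments=["--name", name, "--gpus", str(gpus)])


# Both styles describe the same step; only the construction differs.
old_style = MPIOp("mpi-train", "example/image:latest", gpus=1)
new_style = arena_launch_mpi_op("mpi-train", "example/image:latest", gpus=1)
```

The factory function keeps the public surface to a single callable, which is presumably why the newer samples prefer it.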

Ark-kun · 2019-03-28T21:28:11Z

components/arena/python/arena/_arena_mpi_op.py

+
+# def arena_submit_standalone_job_op(name, image, gpus: int, ):
+
+class MPIOp(dsl.ContainerOp):


def arena_launch_mpi_op(....):
    return ContainerOp(.... )

Ark-kun · 2019-03-28T21:33:55Z

components/arena/python/arena/_arena_standalone_op.py

+# def arena_submit_standalone_job_op(name, image, gpus: int, ):
+
+class StandaloneOp(dsl.ContainerOp):
+  """Submit standalone Job."""


Can you add a more comprehensive docstring?

Sure, I will update.
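
As an illustration of what a more comprehensive docstring might cover, here is a sketch. The parameter names `name`, `image` and `gpus` come from the commented-out signature in the diff; the function body, the returned dictionary and the docstring wording are assumptions made for the example and do not reflect the actual Arena component.

```python
def arena_submit_standalone_job_op(name: str, image: str, gpus: int = 0) -> dict:
    """Describe a standalone (single-worker) training job.

    Args:
        name: Name used to identify the job.
        image: Container image that runs the training code.
        gpus: Number of GPUs requested for the job; 0 means CPU only.

    Returns:
        A dict with the job settings. (Hypothetical return value; a real
        component would return a pipeline op instead.)

    Raises:
        ValueError: If ``gpus`` is negative.
    """
    if gpus < 0:
        raise ValueError("gpus must be >= 0")
    return {"name": name, "image": image, "gpus": gpus}


job = arena_submit_standalone_job_op("train-mnist", "example/image:latest", gpus=1)
```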

cheyang · 2019-03-30T03:10:22Z

@Ark-kun Please take a look again.

Ark-kun · 2019-04-03T01:36:23Z

samples/arena-samples/standalonejob/standalone_pipeline.py

+  except:
+    experiment_id = client.create_experiment(EXPERIMENT_NAME).id
+  run = client.run_pipeline(experiment_id, RUN_ID, __file__ + '.tar.gz',
+                            params={'learning_rate':learning_rate,


AFAIK, there is currently a bug/inconsistency that forces you to use '-' instead of '_' in the argument names here. Have you tried running your sample?

Yes, but I tested it with the pipeline in Kubeflow 0.4.0, not with the latest pipeline.

I have since tested it with kfp 0.1.4. Looks good.
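
If the inconsistency described above does affect a given kfp version, one possible workaround is to normalize the parameter names before passing them as `params`. The helper below is hypothetical (it is not part of kfp or of this PR), and `batch_size` is a made-up parameter used only to show that several keys are handled; whether any of this is needed depends on the kfp version in use.

```python
from typing import Any, Dict


def hyphenate_param_names(params: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of params with '_' replaced by '-' in every key."""
    return {key.replace("_", "-"): value for key, value in params.items()}


params = hyphenate_param_names({"learning_rate": "0.01", "batch_size": "64"})
print(params)  # {'learning-rate': '0.01', 'batch-size': '64'}
```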

Ark-kun · 2019-04-03T01:36:56Z

samples/arena-samples/standalonejob/standalone_pipeline.py

+  name='pipeline to run jobs',
+  description='shows how to run pipeline jobs.'
+)
+def sample_pipeline(learning_rate=dsl.PipelineParam(name='learning_rate',


You can just use learning_rate='0.01'
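
The suggestion amounts to declaring the pipeline parameter with an ordinary Python default. The sketch below leaves out the `@dsl.pipeline` decorator so that it runs without kfp installed, and uses the standard `inspect` module only to show that the default is visible on the function signature; the function body is a placeholder, not the sample's real pipeline.

```python
import inspect


def sample_pipeline(learning_rate='0.01'):
    """Placeholder pipeline body (assumption): just echoes its parameter."""
    return {'learning_rate': learning_rate}


# Collect the declared defaults from the function signature.
defaults = {
    name: param.default
    for name, param in inspect.signature(sample_pipeline).parameters.items()
}
```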

cheyang · 2019-04-11T12:19:10Z

/retest

cheyang · 2019-04-11T12:21:46Z

@Ark-kun, please take a look again when you have time. Thanks.

googlebot · 2019-04-12T11:49:06Z

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and have the pull request author add another comment and the bot will run again. If the bot doesn't comment, it means it doesn't think anything has changed.

googlebot · 2019-04-12T14:01:21Z

CLAs look good, thanks!

cheyang · 2019-04-15T21:48:32Z

/ping @Ark-kun

Ark-kun · 2019-04-16T00:56:44Z

/lgtm

Ark-kun · 2019-04-16T01:07:18Z

Sorry for the delayed response. I wanted to make sure that this sample follows our directory organization for contributed samples.

I'm a bit concerned about the Arena sample being the first one users see when they enter the samples directory. I feel that the sample is quite specialized and might not be the first sample that a user should see.

We can probably fix that later, so I do not see a reason for holding this PR any longer.
/cc @gaoning777 @hongye-sun @vicaire @IronPan
/approve

k8s-ci-robot · 2019-04-16T01:07:23Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Ark-kun

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~components/OWNERS~~ [Ark-kun]
~~samples/OWNERS~~ [Ark-kun]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

cheyang added 2 commits March 28, 2019 15:26

add arena samples to pipelines

f21208a

update directory

36959c2

k8s-ci-robot requested review from Ark-kun, hongye-sun and gaoning777 March 28, 2019 07:46

k8s-ci-robot added the size/XL label Mar 28, 2019

Ark-kun reviewed Mar 28, 2019

Ark-kun reviewed Mar 28, 2019

cheyang added 2 commits March 29, 2019 15:42

update docs

c942dac

update docs

8e6904d

Ark-kun reviewed Apr 3, 2019

cheyang added 9 commits April 4, 2019 14:04

update func name according to comments

2108b1d

update samples

9c7e94d

update samples

5f1dae0

update samples

77acc87

add installation guide

6b02b92

add installation guide

3129c43

update kfp package

f9a1608

update author name

c72c5f1

change timeout unit to hour

b66ac63

k8s-ci-robot added size/XXL and removed size/XL labels Apr 6, 2019

cheyang added 4 commits April 6, 2019 15:42

update API docs

2f5eea3

reduce image size

23fc1f8

update samples

053398d

update docker images

df00d6c

cheyang force-pushed the integrate_arena_into_pipelines branch from afa1800 to 446b20f Compare April 12, 2019 13:59

add mpi op

b83336e

fix the length of name fix the length of name change the default image

cheyang force-pushed the integrate_arena_into_pipelines branch from 446b20f to b83336e Compare April 12, 2019 14:01

cheyang added 10 commits April 13, 2019 10:14

fix typo of metric

989400f

update api version

0e1b1d3

fix metric name

5c98044

fix metric name

1cb6229

fix metric name

6f03374

fix metric name

52dd513

fix metric name

3615e1b

fix metric name

16a24ef

make it show in pipeline ui

a1a0ae9

make it show in pipeline ui

1ddc00e

cheyang added 2 commits April 16, 2019 06:14

update demos

2bc52fb

update demos

b8ab8e9

k8s-ci-robot assigned Ark-kun Apr 16, 2019

k8s-ci-robot added the lgtm label Apr 16, 2019

k8s-ci-robot requested review from IronPan and vicaire April 16, 2019 01:07

k8s-ci-robot added the approved label Apr 16, 2019

k8s-ci-robot merged commit 6806f83 into kubeflow:master Apr 16, 2019
