fix typo error for MinikubeDemo #282

Merged: 1 commit, Dec 11, 2018
34 changes: 17 additions & 17 deletions examples/MinikubeDemo.md
@@ -170,11 +170,11 @@ You can set any configuration for your worker pods.
Here, try to set config for GPU.
The manifest of the worker pods are generated from a template.
The templates are defined in [ConfigMap](./workerConfigMap.yaml).
There are two templates, defaultWorkerTemplate.yaml and gpuWorkerTemplate.yaml.
There are two templates: defaultWorkerTemplate.yaml and gpuWorkerTemplate.yaml.
You can add your template for worker.
Then you should specify the template in your studyjob spec.
[This](/examples/gpu-example.yaml) is example for using `gpuWorkerTemplate.yaml`.
Set "/worker-template/gpuWorkerTemplate.yaml at `workerTemplatePath` field and specify gpu number at `workerParameters/Gpu`
[This example](/examples/gpu-example.yaml) uses `gpuWorkerTemplate.yaml`.
Set "/worker-template/gpuWorkerTemplate.yaml" at `workerTemplatePath` field and specify gpu number at `workerParameters/Gpu`.
You can apply it the same as the other examples.
```
$ kubectl apply -f gpu-example.yaml
@@ -241,7 +241,7 @@ Status:
Workerid: x8f172a64645690e
```

Check the GPU configuration works correctly.
Check if the GPU configuration works correctly.

```
$ kubectl -n kubeflow describe pod g07cba174ada521e-88wpn
@@ -298,11 +298,11 @@ Events:

### Default Metrics Collector

The default metrics will be collect from the StdOut of workers.
It is deploy as a cronjob. It will collect and report metrics periodically.
It collect metrics through k8s pod log API.
You should print logs {metrics name}={value} style.
In the above demo, the objective value name is Validation-accuracy and the metrics are accuracy, your training code should print like this.
The default metrics collector collects metrics from the StdOut of workers.
It is deployed as a cronjob. It will collect and report metrics periodically.
It collects metrics through k8s pod log API.
You should print logs in {metrics name}={value} style.
In the above demo, the objective value name is *Validation-accuracy* and the metrics are [*accuracy*], so your training code should print like below.
```
epoch 1:
batch1 accuracy=0.3
@@ -323,23 +323,23 @@ You can add your template and specify `spec.metricsCollectorSpec.metricsCollecto
### TF Event File Metrics Collector

The TF Event file metrics collector will collect metrics from tf.event files.
It is also deploy as a cronjob.
When you use TF Event File Metrics Collector, you need to share files between a metrics collector and worker with PVC.
It is also deployed as a cronjob.
When you use TF Event File Metrics Collector, you need to share files between the metrics collector and the worker by PVC.
There is an example for TF Event file metrics collector.
First, please create PV and PVC for share event file.
First, please create PV and PVC to share event file.
```
$ kubectl apply -f tfevent-volume/
```
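The actual manifests live in the `tfevent-volume/` directory; the sketch below only illustrates the kind of PersistentVolume/PersistentVolumeClaim pair they create (names, size, and hostPath here are assumptions, chosen for a single-node Minikube cluster).
```
# Sketch only -- names, capacity and hostPath are illustrative; use the shipped
# manifests in tfevent-volume/ as-is.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: tfevent-volume
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /data/tfevent   # fine on Minikube's single node
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: tfevent-volume
  namespace: kubeflow
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```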
Then, create studyjob that use TF Event file metrics collector.
Then, create a studyjob that uses TF Event file metrics collector.
```
$ kubectl apply -f tf-event_test.yaml
```

It will create tensorflow worker and collect metrics from its eventfile.
It will create a tensorflow worker from whose eventfile metrics are collected.
The code of tensorflow is [the official tutorial for mnist with summary](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/tutorials/mnist/mnist_with_summaries.py).
It will save event file to `/log/train` and `/log/test` directory.
They have same named metrics ('accuracy' and 'cross_entropy').
The accuracy in training will be save in train directory and test is in test directory.
The accuracy in training and test will be saved in *train/* directory and *test/* directory respectively.
In a studyjob, please add the directory name to the name of metrics as a prefix, e.g. `train/accuracy`, `test/accuracy`.
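As a concrete illustration, the prefixed names would then show up in the study spec roughly as below; the field names are an assumption based on the earlier examples in this demo, so check `tf-event_test.yaml` for the exact spelling.
```
# Illustrative fragment -- verify the field names against tf-event_test.yaml.
spec:
  objectivevaluename: train/accuracy   # objective metric, read from the train/ event file
  metricsnames:
    - test/accuracy                    # same metric name, distinguished by its directory prefix
```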

## ModelManagement
Expand All @@ -351,7 +351,7 @@ katib-cli -s {{server-cli}} pull study {{study ID or name}} -o {{filename}}
```

And you can push your existing models to Katib with CLI.
`mnist-models.yaml` is traind 22 models using random suggestion with this Parameter Config.
`mnist-models.yaml` trains 22 models using random suggestion with the Parameter Config below.

```
configs:
@@ -383,7 +383,7 @@ configs:
- adam
- ftrl
```
You can easy to explore the model on KatibUI.
You can explore the model easily on KatibUI.

```
katib-cli push md -f mnist-models.yaml