Example with collecting timestamp of the metrics #970

Merged
Changes from 19 commits
21 commits
bebf6c0  Increase Suggestion memLimit  (andreyvelich, Dec 9, 2019)
f1bd3ca  Create getSuggestionConfigData function  (andreyvelich, Dec 10, 2019)
cafd115  Change memLimit for nasrl  (andreyvelich, Dec 10, 2019)
2e57536  Merge remote-tracking branch 'upstream/master' into increase-suggesti…  (andreyvelich, Dec 10, 2019)
f08fd42  Change resources format for katib-config  (andreyvelich, Dec 11, 2019)
c8acc67  Merge remote-tracking branch 'upstream/master' into issue-944-timesta…  (andreyvelich, Dec 11, 2019)
30eaad5  Create example with recording metrics timestamp  (andreyvelich, Dec 11, 2019)
f840692  Merge remote-tracking branch 'upstream/master' into issue-944-timesta…  (andreyvelich, Dec 12, 2019)
1aa3a8e  Add comment line  (andreyvelich, Dec 12, 2019)
d52bd9d  Merge remote-tracking branch 'upstream/master' into issue-944-timesta…  (andreyvelich, Jan 8, 2020)
923902a  Change example from pytorch to mxnet  (andreyvelich, Jan 8, 2020)
9954e6b  Delete find_mxnet file  (andreyvelich, Jan 8, 2020)
d3cbdda  Change mxnet-mnist-timestamp to mxnet-mnist  (andreyvelich, Jan 10, 2020)
68979df  Merge remote-tracking branch 'upstream/master' into issue-944-timesta…  (andreyvelich, Jan 10, 2020)
b3ca4ae  Reduce num epochs in grid  (andreyvelich, Jan 10, 2020)
a714749  Enable autoscaling in CI cluster  (andreyvelich, Jan 10, 2020)
933fb3c  Add max nodes  (andreyvelich, Jan 10, 2020)
6710326  Add num nodes 6  (andreyvelich, Jan 10, 2020)
ccbd475  Increase num nodes  (andreyvelich, Jan 10, 2020)
0c651a5  Change num nodes to 6  (andreyvelich, Jan 13, 2020)
f25c13d  Remove autoscaling  (andreyvelich, Jan 13, 2020)
8 changes: 4 additions & 4 deletions examples/v1alpha3/bayesianoptimization-example.yaml
@@ -11,7 +11,7 @@ spec:
  goal: 0.99
  objectiveMetricName: Validation-accuracy
  additionalMetricNames:
- - accuracy
+ - Train-accuracy
  algorithm:
  algorithmName: bayesianoptimization
  algorithmSettings:
@@ -51,10 +51,10 @@ spec:
  spec:
  containers:
  - name: {{.Trial}}
- image: docker.io/kubeflowkatib/mxnet-mnist-example
+ image: docker.io/kubeflowkatib/mxnet-mnist
  command:
- - "python"
- - "/mxnet/example/image-classification/train_mnist.py"
+ - "python3"
+ - "/opt/mxnet-mnist/mnist.py"
  - "--batch-size=64"
  {{- with .HyperParameters}}
  {{- range .}}
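The objective block above names `Validation-accuracy` as the objective metric (goal 0.99) and `Train-accuracy` as an additional metric. As a rough illustration of what collecting those metrics could look like, the sketch below pulls `name=value` pairs out of log lines and keeps the latest value per metric. The log format, the sample lines, and the helper name `latest_metrics` are assumptions for illustration, not Katib's actual collector.

```python
import re

# Metric names and goal copied from the experiment spec in the diff above.
OBJECTIVE = "Validation-accuracy"
ADDITIONAL = ["Train-accuracy"]
GOAL = 0.99

# Assumed log format: "<anything> <metric-name>=<float>".
METRIC_RE = re.compile(r"([\w-]+)=([0-9]*\.?[0-9]+)")


def latest_metrics(lines, names):
    """Return the last reported value for each requested metric name."""
    latest = {}
    for line in lines:
        for name, value in METRIC_RE.findall(line):
            if name in names:
                latest[name] = float(value)
    return latest


# Hypothetical log lines, for illustration only.
sample_log = [
    "Epoch[0] Train-accuracy=0.912",
    "Epoch[0] Validation-accuracy=0.953",
    "Epoch[1] Train-accuracy=0.968",
    "Epoch[1] Validation-accuracy=0.991",
]

metrics = latest_metrics(sample_log, [OBJECTIVE] + ADDITIONAL)
goal_reached = metrics[OBJECTIVE] >= GOAL
```

With the sample lines, the latest `Validation-accuracy` is compared against the goal; a metric name that never appears in the log is simply absent from the result.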
12 changes: 6 additions & 6 deletions examples/v1alpha3/grid-example.yaml
@@ -11,7 +11,7 @@ spec:
  goal: 0.99
  objectiveMetricName: Validation-accuracy
  additionalMetricNames:
- - accuracy
+ - Train-accuracy
  algorithm:
  algorithmName: grid
  parallelTrialCount: 3
@@ -32,8 +32,8 @@ spec:
  - name: --num-epochs
  parameterType: int
  feasibleSpace:
- min: "20"
- max: "40"
+ min: "10"
+ max: "15"
  # Grid doesn't support categorical, refer to https://chocolate.readthedocs.io/api/sample.html#chocolate.Grid
  # - name: --optimizer
  # parameterType: categorical
@@ -55,10 +55,10 @@ spec:
  spec:
  containers:
  - name: {{.Trial}}
- image: docker.io/kubeflowkatib/mxnet-mnist-example
+ image: docker.io/kubeflowkatib/mxnet-mnist
  command:
- - "python"
- - "/mxnet/example/image-classification/train_mnist.py"
+ - "python3"
+ - "/opt/mxnet-mnist/mnist.py"
  - "--batch-size=64"
  {{- with .HyperParameters}}
  {{- range .}}
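To see the effect of narrowing `--num-epochs` from 20..40 to 10..15, the sketch below enumerates the candidate values and builds a trial command for each. It assumes an integer step of 1 and that each hyperparameter is appended to the command as `name=value`. The template body after `{{- range .}}` is not shown in this diff, so both are assumptions, and `int_grid` and `trial_command` are hypothetical helpers.

```python
# Base command copied from the grid-example.yaml diff above.
BASE_COMMAND = ["python3", "/opt/mxnet-mnist/mnist.py", "--batch-size=64"]


def int_grid(minimum, maximum, step=1):
    """Inclusive integer grid; the step of 1 is an assumption."""
    return list(range(int(minimum), int(maximum) + 1, step))


def trial_command(hyperparameters):
    """Append each hyperparameter as 'name=value' (assumed rendering)."""
    extra = [f"{name}={value}" for name, value in hyperparameters.items()]
    return BASE_COMMAND + extra


old_epochs = int_grid("20", "40")  # feasible space before this change
new_epochs = int_grid("10", "15")  # feasible space after this change

commands = [trial_command({"--num-epochs": n}) for n in new_epochs]
```

Under these assumptions the old range gives 21 candidate values for this parameter and the new one gives 6; any other parameters in the search space multiply that count.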
8 changes: 4 additions & 4 deletions examples/v1alpha3/hyperband-example.yaml
@@ -11,7 +11,7 @@ spec:
  goal: 0.99
  objectiveMetricName: Validation-accuracy
  additionalMetricNames:
- - accuracy
+ - Train-accuracy
  algorithm:
  algorithmName: hyperband
  algorithmSettings:
@@ -58,10 +58,10 @@ spec:
  spec:
  containers:
  - name: {{.Trial}}
- image: kubeflowkatib/mxnet-mnist-example
+ image: docker.io/kubeflowkatib/mxnet-mnist
  command:
- - "python"
- - "/mxnet/example/image-classification/train_mnist.py"
+ - "python3"
+ - "/opt/mxnet-mnist/mnist.py"
  - "--batch-size=64"
  {{- with .HyperParameters}}
  {{- range .}}
13 changes: 13 additions & 0 deletions examples/v1alpha3/mxnet-mnist/Dockerfile
@@ -0,0 +1,13 @@
+ FROM ubuntu:16.04
+
+ RUN apt-get update && \
+     apt-get install -y wget python3-dev gcc && \
+     wget https://bootstrap.pypa.io/get-pip.py && \
+     python3 get-pip.py
+
+ RUN pip3 install mxnet
+
+ ADD . /opt/mxnet-mnist
+ WORKDIR /opt/mxnet-mnist
+
+ ENTRYPOINT ["python3", "/opt/mxnet-mnist/mnist.py"]
6 changes: 6 additions & 0 deletions examples/v1alpha3/mxnet-mnist/README.md
@@ -0,0 +1,6 @@
# MXNet image classification example
This is an MXNet image classification training container that records the timestamp of each metric.

It uses only a simple multilayer perceptron (MLP) network.

To read more about this example, visit the official [incubator-mxnet](https://github.com/apache/incubator-mxnet/tree/master/example/image-classification) GitHub repository.
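The README says the container records the time of each metric. A minimal sketch of one way to do that with Python's standard `logging` module follows: each metric line is prefixed with a UTC timestamp, and a small parser reads it back. The line format, the logger name, and both helper functions are assumptions for illustration; the PR's actual `mnist.py` is not shown here.

```python
import io
import logging
import time
from datetime import datetime

# Assumed timestamp layout for each metric line.
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def make_metric_logger(stream):
    """Build a logger that writes '<UTC timestamp> <message>' lines to stream."""
    formatter = logging.Formatter(fmt="%(asctime)s %(message)s",
                                  datefmt=TIMESTAMP_FORMAT)
    formatter.converter = time.gmtime  # emit UTC so the trailing "Z" is accurate
    handler = logging.StreamHandler(stream)
    handler.setFormatter(formatter)
    logger = logging.getLogger("mnist-timestamp-sketch")  # hypothetical name
    logger.handlers.clear()
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def parse_metric_line(line):
    """Split '<timestamp> <name>=<value>' into (datetime, name, float)."""
    stamp, _, rest = line.strip().partition(" ")
    name, _, value = rest.rpartition("=")
    return datetime.strptime(stamp, TIMESTAMP_FORMAT), name, float(value)


buffer = io.StringIO()
log = make_metric_logger(buffer)
log.info("Validation-accuracy=%.4f", 0.9712)

when, name, value = parse_metric_line(buffer.getvalue())
```

Writing to an in-memory buffer keeps the sketch self-contained; a real training script would more likely log to stdout or a file that a metrics collector reads.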