MetricsCollector does not parse time of the metric #944

Closed
andreyvelich opened this issue Dec 2, 2019 · 5 comments · Fixed by #970

@andreyvelich
Member

/kind bug

I can't see the metric time in the observation_logs table after the experiment finishes.

What steps did you take and what happened:
I ran random-example on my GCP cluster. In the Trial's containers I got a warning like this:

W1202 17:26:56.160885      15 file-metricscollector.go:50] Metrics will not have timestamp since error parsing time INFO:root:Epoch[0]: parsing time "INFO:root:Epoch[0]" as "2006-01-02T15:04:05.999999999Z07:00": cannot parse "INFO:root:Epoch[0]" as "2006"

After that, I checked the observation_logs table and saw the time 001-01-01T00:00:00 on every recorded metric.
From my understanding, the default metrics collector parses the /var/log/katib/metrics.log file. The lines in this file don't carry a timestamp, so we can't see how the metrics change over time.
Maybe we could also save a timestamp for every line in the log file?
Or we could find a better way to attach a timestamp to the metrics.
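
For illustration, here is a minimal Go sketch of why the parse fails. It assumes, based only on the warning text, that the collector treats the first whitespace-delimited token of each line as the timestamp. `parseLeadingTimestamp` is a hypothetical helper, not the code in file-metricscollector.go, and the `Train-accuracy=0.5` part of the line is made up.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// parseLeadingTimestamp treats the first whitespace-delimited token of a log
// line as an RFC 3339 timestamp. Hypothetical helper for illustration only;
// the real collector may split lines differently.
func parseLeadingTimestamp(line string) (time.Time, error) {
	fields := strings.Fields(line)
	if len(fields) == 0 {
		return time.Time{}, fmt.Errorf("empty log line")
	}
	// time.RFC3339Nano is the layout quoted in the warning:
	// "2006-01-02T15:04:05.999999999Z07:00".
	return time.Parse(time.RFC3339Nano, fields[0])
}

func main() {
	// A line without a timestamp prefix, shaped like the one in the warning.
	ts, err := parseLeadingTimestamp("INFO:root:Epoch[0] Train-accuracy=0.5")
	fmt.Println(err != nil)                       // true: the token is not a timestamp
	fmt.Println(ts.IsZero())                      // true: time.Parse returns the zero time on error
	fmt.Println(ts.Format("2006-01-02T15:04:05")) // 0001-01-01T00:00:00

	// The same line with an RFC 3339 timestamp in front parses cleanly.
	ts, err = parseLeadingTimestamp("2019-12-02T17:26:56Z INFO:root:Epoch[0] Train-accuracy=0.5")
	fmt.Println(err == nil, ts.Year()) // true 2019
}
```

If the zero `time.Time` is what ends up being stored, that would explain the year-1 value in observation_logs, though I have not confirmed that this is the path the collector takes.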
/cc @hougangliu @gaocegege @johnugeorge

@hougangliu
Member

The default metricsCollector always tries to parse each line of the user application's log and expects a timestamp at the start of the line. If there is no timestamp, it ignores the timestamp (uses NULL in the time field).
So maybe we can fix the example image and make its log follow this format.

@andreyvelich
Member Author

@hougangliu Yes, I agree that we need an example that prints a timestamp in the training container, like we did with file-metrics-collector.
I will take it.
/assign

@andreyvelich
Member Author

andreyvelich commented Dec 11, 2019

@hougangliu Should I push my image to the docker.io/kubeflowkatib repo, or should I use my own Docker Hub account?

@hougangliu
Member

We should use docker.io/kubeflowkatib. Please leave your Docker Hub account name so that I can add you to the org.

@andreyvelich
Member Author

@hougangliu Sure, my id is: andreyvelichkevich
