This repository has been archived by the owner on Sep 18, 2024. It is now read-only.
In a remote experiment, if one of the nodes goes offline for any reason, there is no way to recover from that state. In the web UI the trial simply moves to the "UNKNOWN" state and stays there. You can't cancel the current trial and let a new one start once the machine is back online. This is especially a problem with Docker, where the node no longer even has the trial source in its /tmp folder.
Feature request: enable a "cancel" action for trials in the UNKNOWN state, which discards the current trial and tries to start a new one.
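To make the request concrete, here is a rough sketch of the behaviour I have in mind. It is not taken from the project's code: `TrialStatus`, `Trial`, `CANCELLABLE` and `cancel_and_resubmit` are hypothetical names, and reusing the same parameters for the replacement trial is my assumption (the replacement could just as well come from the tuner).

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class TrialStatus(Enum):
    # Illustrative subset of trial states; names are assumptions for this sketch.
    WAITING = "WAITING"
    RUNNING = "RUNNING"
    UNKNOWN = "UNKNOWN"
    USER_CANCELED = "USER_CANCELED"


# Hypothetical policy: UNKNOWN is cancellable, just like WAITING and RUNNING.
CANCELLABLE = {TrialStatus.WAITING, TrialStatus.RUNNING, TrialStatus.UNKNOWN}


@dataclass
class Trial:
    trial_id: str
    parameters: dict
    status: TrialStatus = TrialStatus.WAITING


def cancel_and_resubmit(trial: Trial, queue: List[Trial]) -> Trial:
    """Mark `trial` as cancelled and enqueue a fresh trial with the same parameters."""
    if trial.status not in CANCELLABLE:
        raise ValueError(f"cannot cancel trial in state {trial.status.name}")
    # Give up on the lost trial without trying to reach the offline node.
    trial.status = TrialStatus.USER_CANCELED
    # Assumption: the replacement reuses the old parameters and starts as WAITING.
    replacement = Trial(trial_id=trial.trial_id + "-retry",
                        parameters=dict(trial.parameters))
    queue.append(replacement)
    return replacement
```

The point is only that the cancel path should not depend on the node being reachable: the lost trial is marked as cancelled locally and a new one is queued for whichever machine is available.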
Thanks!