add restart policy & scheduler name for workflow pods #1109
I agree we should provide the ability to specify the scheduler name and restart policy, but at the step level rather than for the whole workflow. Users would have to repeat the settings, but that problem should be solved as part of #799.
@jessesuen, I need your opinion on the restart policy. At first I thought we didn't need it, since RetryStrategy is available. After some more thought, I've decided it is useful: a user might want to choose a pod restart policy to make sure the retry happens on the same node.
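To make the step-level suggestion concrete, here is a minimal Go sketch of how step-level settings could be resolved against workflow-level defaults so users need not repeat them. The `PodSettings` type and `resolve` function are hypothetical names for illustration only; they are not the project's API, and this is not necessarily the approach #799 takes.

```go
package main

import "fmt"

// PodSettings is a hypothetical container for the two fields discussed
// in this PR. It is not a type from the project.
type PodSettings struct {
	SchedulerName string
	RestartPolicy string
}

// resolve starts from the workflow-level settings and overrides each
// field that the step sets explicitly (non-empty).
func resolve(workflow, step PodSettings) PodSettings {
	out := workflow
	if step.SchedulerName != "" {
		out.SchedulerName = step.SchedulerName
	}
	if step.RestartPolicy != "" {
		out.RestartPolicy = step.RestartPolicy
	}
	return out
}

func main() {
	wf := PodSettings{SchedulerName: "default-scheduler", RestartPolicy: "Never"}
	step := PodSettings{SchedulerName: "batch-scheduler"}
	// The step overrides the scheduler and inherits the restart policy.
	fmt.Printf("%+v\n", resolve(wf, step))
}
```

With a fallback like this, a step only states what differs from the workflow default.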
@houz42 I think the scheduler is a fine addition. However, a restartPolicy of OnFailure is problematic to set because restartPolicy is a pod-spec-level setting, so it would apply to the wait sidecar as well. The current design is that the controller relies on the wait sidecar exiting non-zero in many situations to understand the status of the step. For example, the entire wait logic will return non-zero if any of the following goes wrong:
In order to support a restartPolicy of OnFailure, we would need to modify the executor to always exit zero and to communicate to the controller in a different way that a step had failed. It's unclear what this mechanism would be. One thought: we already use pod annotations to communicate error messages to the controller. The controller could be modified so that it always expects a pod annotation to be set (even on success). Then, if the pod completed without setting an annotation, something went wrong and the controller could fail the step. So as it stands, supporting
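The annotation idea in the comment above could look roughly like the following Go sketch: the controller treats a completed pod with no outcome annotation as a failed step. Everything here is an assumption for illustration, including the `outcomeAnnotation` key, the simplified `pod` struct, and the `assessStep` function; none of it is the controller's actual code.

```go
package main

import "fmt"

// outcomeAnnotation is a hypothetical annotation key that the executor
// would always set before exiting zero, on success and on failure.
const outcomeAnnotation = "example.io/step-outcome"

// pod is a simplified stand-in for the Kubernetes pod object.
type pod struct {
	Completed   bool
	Annotations map[string]string
}

// assessStep returns the step phase and a message. A completed pod that
// never reported an outcome is treated as failed.
func assessStep(p pod) (phase, message string) {
	if !p.Completed {
		return "Running", ""
	}
	outcome, ok := p.Annotations[outcomeAnnotation]
	if !ok {
		return "Failed", "pod completed without reporting an outcome"
	}
	if outcome == "Succeeded" {
		return "Succeeded", ""
	}
	// Any other value is taken to be the error message from the executor.
	return "Failed", outcome
}

func main() {
	p := pod{Completed: true, Annotations: map[string]string{}}
	phase, msg := assessStep(p)
	fmt.Println(phase, msg)
}
```

The point of the design is that the container exit code no longer carries the step status, so a pod-level restartPolicy of OnFailure would not restart the sidecar just because a step failed.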
@jessesuen Maybe I should submit the schedulerName changes first and consider restartPolicy later.
Yes, scheduler-only changes would be fine.
Added scheduler name only in #1184.
* chore: deprecate in v1.5 comments Signed-off-by: Derek Wang <whynowy@gmail.com>
RestartPolicy and SchedulerName are useful for controlling how a workflow's pods are run.