-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add "set_retry()" on ContainerOp. #723
Conversation
/lgtm |
sdk/python/kfp/dsl/_container_op.py
Outdated
@@ -62,6 +62,7 @@ def __init__(self, name: str, image: str, command: str=None, arguments: str=None | |||
self.env_variables = [] | |||
self.pod_annotations = {} | |||
self.pod_labels = {} | |||
self.num_retry = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self.num_retry = 0 | |
self.num_retries = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
sdk/python/kfp/dsl/_container_op.py
Outdated
@@ -288,5 +288,15 @@ def add_pod_label(self, name: str, value: str): | |||
self.pod_labels[name] = value | |||
return self | |||
|
|||
def set_retry(self, num_retry: int): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def set_retry(self, num_retry: int): | |
def set_retries(self, num_retries: int): |
Generally, it's nice to name the properties the same way as the underlying structure, but set_retry_strategy_limit
is probably too long.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, the reason Argo has retryStrategy.limit
property instead of just numRetries
is probably because they want to add more ways to do the retries (e.g. exponential backoff). If that feature is added, what's our plan on supporting it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can decide once argo has more. For example, we can add set_retry_strategy or even deprecate the simple retry. But usually I am in favor of simplicity, especially we don't know argo's plan.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am slightly in favor of set_retry because it is more discoverable. For example, people will probably search the APIs with "retry" but not "retries". Users should understand it is not "enable or disable" because the" num_retries" parameter is pretty clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw, changed num_retry --> num_retries.
sdk/python/kfp/dsl/_container_op.py
Outdated
@@ -288,5 +288,15 @@ def add_pod_label(self, name: str, value: str): | |||
self.pod_labels[name] = value | |||
return self | |||
|
|||
def set_retry(self, num_retry: int): | |||
"""Sets retry times. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"""Sets retry times. | |
"""Sets the number of times the task is retried until it's declared failed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
sdk/python/kfp/dsl/_container_op.py
Outdated
"""Sets retry times. | ||
|
||
Args: | ||
num_retry: Number of times to retry on failures. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
num_retry: Number of times to retry on failures. | |
num_retries: Number of times to retry on failures. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
sdk/python/kfp/dsl/_container_op.py
Outdated
num_retry: Number of times to retry on failures. | ||
""" | ||
|
||
self.num_retry = num_retry |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self.num_retry = num_retry | |
self.num_retries = num_retries |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
key: secretkey | ||
name: mlpipeline-minio-artifact | ||
retryStrategy: | ||
limit: 5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
limit: 5 | |
limit: 50 |
key: secretkey | ||
name: mlpipeline-minio-artifact | ||
retryStrategy: | ||
limit: 10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
limit: 10 | |
limit: 100 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah! Nice catch. Fixed.
/lgtm |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: Ark-kun The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
…w#723) * deploy custom img using client sdk * Clean readme * update notebook * restructure * fix typo * Clean outputs * Add notebook checkpoints to top level gitignore
…tible mode (kubeflow#723) * remove argo template reference * address the suggestions
This change is