Reconstruct failed actors without sending tasks. #5161
Test PASSed.
Test FAILed.
Looks good to me. Thanks!
Thanks, this looks good!
python/ray/tests/utils.py
Outdated
@@ -94,3 +94,25 @@ def wait_for_errors(error_type, num_errors, timeout=10):
             return
         time.sleep(0.1)
     raise Exception("Timing out of wait.")
+
+
+def wait_for_contition(condition_predictor,
Suggested change
-def wait_for_contition(condition_predictor,
+def wait_for_condition(condition_predictor,
python/ray/tests/utils.py
Outdated
+def wait_for_contition(condition_predictor,
+                       timeout_ms=1000,
+                       retry_interval_ms=100):
+    """A helper function that wait until a conition is met.
Suggested change
-    """A helper function that wait until a conition is met.
+    """A helper function that waits until a condition is met.
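For context, a helper with this signature could look roughly like the sketch below. Only the function name, the three parameters, and their defaults come from the diff above; the body and the boolean return value are assumptions, since the hunk is truncated after the docstring.

```python
import time


def wait_for_condition(condition_predictor,
                       timeout_ms=1000,
                       retry_interval_ms=100):
    """A helper function that waits until a condition is met.

    Assumed behavior (not shown in the diff): returns True as soon as
    condition_predictor() is truthy, or False once timeout_ms has elapsed.
    """
    deadline = time.monotonic() + timeout_ms / 1000.0
    while True:
        # Check first so an already-true condition returns without sleeping.
        if condition_predictor():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(retry_interval_ms / 1000.0)


# Hypothetical usage: poll for a state change with a 5 s budget.
start = time.monotonic()
became_ready = wait_for_condition(
    lambda: time.monotonic() - start > 0.2, timeout_ms=5000)
```

Returning a boolean instead of raising lets the calling test decide how to report a timeout; raising, as `wait_for_errors` does, would be an equally reasonable choice.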
Test FAILed.
Test PASSed.
Looks like the unit test that was added failed on one of the Travis runs: https://travis-ci.com/ray-project/ray/jobs/215417720. We should increase the timeout for that test.
Thanks, increased to 5s.
Test PASSed.
Test FAILed.
@stephanie-wang Tests have passed. Can you give a stamp?
Thanks! :)
What do these changes do?
Previously, reconstruction of a failed actor was triggered only when a task was sent to it. This is a problem for actors that never receive tasks: for example, an actor that only reads data from an external DB would never be reconstructed after it fails. This PR reconstructs failed actors without requiring a task to be sent.
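The difference between the two behaviors described above can be illustrated with a toy model. This is not Ray code: `ToyActorManager` and all of its methods are hypothetical names invented for this sketch, and it only mimics when the restart is triggered.

```python
class ToyActorManager:
    """Toy stand-in for a component tracking one actor (hypothetical)."""

    def __init__(self, eager_reconstruction):
        self.eager_reconstruction = eager_reconstruction
        self.alive = True
        self.restarts = 0

    def _reconstruct(self):
        self.alive = True
        self.restarts += 1

    def on_actor_failure(self):
        self.alive = False
        if self.eager_reconstruction:
            # Behavior after this PR: restart as soon as the failure is seen.
            self._reconstruct()

    def submit_task(self):
        if not self.alive:
            # Behavior before this PR: the restart only happens here.
            self._reconstruct()
        return "ok"


lazy = ToyActorManager(eager_reconstruction=False)
lazy.on_actor_failure()

eager = ToyActorManager(eager_reconstruction=True)
eager.on_actor_failure()

print(lazy.alive, eager.alive)  # False True
```

In the lazy case the actor stays dead until `submit_task` is called, which is exactly the situation an actor that only polls an external DB would be stuck in.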
Related issue number
Linter
I've run scripts/format.sh to lint the changes in this PR.