Increase memory requirement for tasks failed due to worker crash #15624

arhimondr
Contributor

Description

A task failure caused by a suspected worker crash very likely indicates
a memory accounting problem. Increasing the task's memory requirement
effectively lowers overall cluster load, giving such problematic tasks
a better chance of finishing.
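
To illustrate the idea, here is a minimal sketch of how a retry could pick its next memory requirement. It is not the code in this PR: the class `MemoryRetrySketch`, the enum `FailureKind`, the method `nextMemoryRequirement`, and the growth factor are all hypothetical names and values chosen for the example.

```java
// Hypothetical sketch: names and the growth factor are illustrative
// assumptions, not Trino's actual classes or configuration.
final class MemoryRetrySketch {
    // Simplified failure categories relevant to the retry decision.
    enum FailureKind { SUSPECTED_WORKER_CRASH, OTHER }

    private MemoryRetrySketch() {}

    // Returns the memory (in bytes) to reserve for the next attempt of a task.
    static long nextMemoryRequirement(long previousBytes, FailureKind failure, double growthFactor) {
        if (previousBytes <= 0) {
            throw new IllegalArgumentException("previousBytes must be positive");
        }
        if (growthFactor < 1.0) {
            throw new IllegalArgumentException("growthFactor must be at least 1.0");
        }
        if (failure == FailureKind.SUSPECTED_WORKER_CRASH) {
            // A crash hints at under-accounted memory, so reserve more on retry.
            // A larger reservation leaves room for fewer concurrent tasks.
            return (long) Math.ceil(previousBytes * growthFactor);
        }
        // Other failures are assumed to be unrelated to memory: keep the estimate.
        return previousBytes;
    }
}
```

With a growth factor of 2.0, a task that reserved 1024 bytes and failed with `SUSPECTED_WORKER_CRASH` would be retried with 2048 bytes, while a task that failed with `OTHER` would keep its 1024-byte reservation.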

Additional context and related issues

Release notes

(X) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text:


arhimondr requested a review from losipiuk on January 5, 2023 21:05
cla-bot (bot) added the cla-signed label on Jan 5, 2023
arhimondr merged commit dae2060 into trinodb:master on Jan 9, 2023
github-actions (bot) added this to the 406 milestone on Jan 9, 2023