[18.09 backport] Fix leaking task resources when nodes are deleted #2842
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
backport of #2806 for the bump_v18.09 branch
When a node is deleted, its tasks are asked to restart, which involves
putting them into a desired state of Shutdown. However, the Allocator
will not deallocate a task which is not in an actual state of a terminal
state. Once a node is deleted, the only opportunity for its tasks to
recieve updates and be moved to a terminal state is when the function
moving those tasks to TaskStateOrphaned is called, 24 hours after the
node enters the Down state. However, if a leadership change occurs, then
that function will never be called, and the tasks will never be moved to
a terminal state, leaking resources.
With this change, upon node deletion, all of its tasks will be moved to
TaskStateOrphaned, allowing those tasks' resources to be cleaned up.