[18.09 backport] Fix leaking task resources when nodes are deleted #2842

Merged

thaJeztah

backport of #2806 for the bump_v18.09 branch

When a node is deleted, its tasks are asked to restart, which involves
putting them into a desired state of Shutdown. However, the Allocator
will not deallocate a task whose actual state is not a terminal
state. Once a node is deleted, the only opportunity for its tasks to
receive updates and be moved to a terminal state is when the function
moving those tasks to TaskStateOrphaned is called, 24 hours after the
node enters the Down state. However, if a leadership change occurs, then
that function will never be called, and the tasks will never be moved to
a terminal state, leaking resources.

With this change, upon node deletion, all of its tasks will be moved to
TaskStateOrphaned, allowing those tasks' resources to be cleaned up.
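
A minimal sketch of the behaviour described above, not the actual swarmkit code: the `Task` struct, the three-value `TaskState` enum, and the helpers `isTerminal`, `releasable`, and `orphanTasksOfNode` are all hypothetical names chosen for illustration. It assumes that any state after Running counts as terminal, and shows why a desired state of Shutdown alone does not free resources while an actual state of Orphaned does.

```go
package main

// TaskState is a simplified, hypothetical stand-in for swarmkit's task states.
// States after TaskStateRunning are treated as terminal in this sketch.
type TaskState int

const (
	TaskStateRunning TaskState = iota
	TaskStateShutdown
	TaskStateOrphaned
)

// Task models only the fields the description mentions (hypothetical shape).
type Task struct {
	ID           string
	NodeID       string
	DesiredState TaskState // what the orchestrator asks for
	State        TaskState // actual (observed) state
}

// isTerminal reports whether an actual state lets the allocator free resources.
func isTerminal(s TaskState) bool {
	return s > TaskStateRunning
}

// releasable returns the IDs of tasks whose resources could be deallocated.
// Only the actual state matters; a desired state of Shutdown is not enough.
func releasable(tasks []*Task) []string {
	var ids []string
	for _, t := range tasks {
		if isTerminal(t.State) {
			ids = append(ids, t.ID)
		}
	}
	return ids
}

// orphanTasksOfNode moves every non-terminal task of the deleted node to
// TaskStateOrphaned and returns how many tasks it changed.
func orphanTasksOfNode(tasks []*Task, nodeID string) int {
	changed := 0
	for _, t := range tasks {
		if t.NodeID == nodeID && !isTerminal(t.State) {
			t.State = TaskStateOrphaned
			changed++
		}
	}
	return changed
}
```

In this sketch, a task on the deleted node with a desired state of Shutdown but an actual state of Running is not returned by `releasable` until `orphanTasksOfNode` has been called for that node. Doing this at deletion time, rather than from a delayed callback, means the outcome does not depend on the same leader still being in place 24 hours later.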

Signed-off-by: Drew Erny <drew.erny@docker.com>
(cherry picked from commit 8467e6a)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
@thaJeztah

ping @dperny PTAL

@codecov
codecov bot commented Mar 26, 2019

Codecov Report

Merging #2842 into bump_v18.09 will increase coverage by 0.17%.
The diff coverage is 85.71%.

@@              Coverage Diff               @@
##           bump_v18.09   #2842      +/-   ##
==============================================
+ Coverage        61.63%   61.8%   +0.17%     
==============================================
  Files              134     134              
  Lines            21871   21872       +1     
==============================================
+ Hits             13480   13518      +38     
+ Misses            6932    6893      -39     
- Partials          1459    1461       +2

@dperny dperny merged commit 19e791f into moby:bump_v18.09 Mar 28, 2019
@thaJeztah thaJeztah deleted the 18.09_backport_fix_leaking_task_resources branch March 28, 2019 20:52