-
Notifications
You must be signed in to change notification settings - Fork 368
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
restart helper: give container process some time to shutdown #2789
restart helper: give container process some time to shutdown #2789
Conversation
✅ Deploy Preview for devspace-docs canceled.Built without sensitive environment variables
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some ideas for reducing the repetition.
helper/util/restart_linux.go
Outdated
for i := 0; i < 5; i++ { | ||
time.Sleep(time.Second) | ||
_, err := os.Stat(path) | ||
if os.IsNotExist(err) { | ||
stderrlog.Infof("Process terminated!") | ||
return nil; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could use wait.PollImmediate
here instead of a loop. It's already included in the helper it seems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated. According to the docs, wait.PollImmediate
is deprecated, but the version already included in the helper does not have the replacement function yet, so i used it anyway. Would it be better to update the import?
Alternatively, we could wait for the process using the pidfd interface. Supposedly this would be slightly more efficient, but also, more importantly, it would avoid a race condition in case the pid would be reused between sending the signal and polling for termination. It would require an extra import and might make the helper somewhat less portable though (the syscall appeared in Linux 5.3).
func KillPidFd(pid int) {
fmt.Printf("Opening process %d\n", pid)
fd, err := pidfd.Open(pid, 0)
if err != nil {
fmt.Printf("Error: %s\n", err.Error())
return
}
for _, sig := range []syscall.Signal{syscall.SIGINT, syscall.SIGTERM, syscall.SIGKILL} {
fmt.Printf("Sending %s signal...\n", sig.String())
err = fd.SendSignal(sig, 0)
if err != nil {
fmt.Printf("Error: %s\n", err.Error())
return
}
fds := []unix.PollFd{{Fd: int32(fd), Events: unix.POLLIN}}
n, _ := unix.Poll(fds, 5000 /* milliseconds */)
if n > 0 {
fmt.Printf("Process terminated!\n")
return
}
}
fmt.Printf("Timeout!\n")
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is already a good improvement over the original code. I think we can wait to upgrade to the replacement for wait.PollImmediate
, so that it can be upgraded outside the helper as well.
Regarding the pidfd
interface, if the race condition has been observed in your testing, and happens frequently enough, then we can consider adding it. If it hasn't been observed then we can skip for now.
helper/util/restart_linux.go
Outdated
} | ||
} | ||
|
||
stderrlog.Infof("Sending SIGTERM...") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might consider moving the wait logic into a function. Something along the lines of:
stderrlog.Infof("Sending SIGINT...")
_ = syscall.Kill(-pgid, syscall.SIGINT)
if done, err := waitForTerminated(pgid); err != nil {
return err
} else if done {
return nil
}
Signed-off-by: Maarten Deprez <mdp@si-int.eu>
69b7955
to
e02449d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Resolves ENG-2674
What issue type does this pull request address? (keep at least one, remove the others)
/kind bugfix
What does this pull request do? Which issues does it resolve?
resolves #2788
Please provide a short message that should be published in the DevSpace release notes
The helper restart command would wait for 2 microseconds (2000 nanoseconds, supposedly meant to be 2 seconds) between sending SIGINT and SIGKILL to the subcommand. Changed this to send SIGINT, SIGTERM and SIGKILL with five seconds in between, checking for process termination.
What else do we need to know?