-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Teleport signal handling and live reload. #1679
Conversation
19578b5
to
909a554
Compare
retest this please |
8bdde0e
to
b890087
Compare
lib/backend/dir/impl.go
Outdated
} | ||
return trace.ConvertSystemError(err) | ||
} | ||
if err := syscall.Flock(int(f.Fd()), syscall.LOCK_EX); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use writeLock()
method from above?
lib/backend/dir/impl.go
Outdated
return trace.ConvertSystemError(err) | ||
} | ||
defer f.Close() | ||
if err := writeLock(f); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wasn't this file just locked a couple of lines above?
const teleportFilesEnvVar = "TELEPORT_OS_FILES" | ||
|
||
func execPath() (string, error) { | ||
name, err := exec.LookPath(os.Args[0]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's std os.Executable()
method that seems to do what you need.
tool/teleport/common/teleport.go
Outdated
@@ -194,7 +173,7 @@ func Run(options Options) (executedCommand string, conf *service.Config) { | |||
if err != nil { | |||
utils.FatalError(err) | |||
} | |||
log.Info("teleport: clean exit") | |||
log.Debugf("Clean exit.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Debug
.
tool/teleport/common/teleport.go
Outdated
if ccf.Gops { | ||
log.Debugf("starting gops agent") | ||
err := agent.Listen(&agent.Options{Addr: ccf.GopsAddr}) | ||
log.Debugf("Starting gops agent.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Debug
.
} | ||
warnOnErr(sshProxy.Close()) | ||
} else { | ||
log.Infof("Shutting down gracefully.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Info
.
log.Infof("Proxy service exited.") | ||
defer listeners.Close() | ||
if payload == nil { | ||
log.Infof("Shutting down immediately.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Info
.
} | ||
log.Infof("Exited.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Info
.
@@ -402,7 +402,7 @@ func (s *server) diffConns(newConns, existingConns map[string]services.TunnelCon | |||
} | |||
|
|||
func (s *server) Wait() { | |||
s.srv.Wait() | |||
s.srv.Wait(context.TODO()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be better to have a caller pass context to the outer method (similar to Shutdown below)? Or is it too much refactoring?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, no one is using it now, so I decided to pass
if err != nil { | ||
err = trace.ConvertSystemError(err) | ||
if !trace.IsNotFound(err) { | ||
return err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
trace.Wrap(err)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ConvertSystemError wraps
For folks who are interested why are we using flock vs other types of locks, read this article: https://gavv.github.io/blog/file-locks/ and this one is also fun: |
for i := 0; i < attempts; i++ { | ||
go func(cnt int) { | ||
err := s.bk.UpsertVal(bucket, "key", []byte("new-value"), backend.Forever) | ||
err := s.bk.UpsertVal(bucket, "key", []byte(value1), time.Hour) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
switch to create back
} | ||
s.Infof("Shutdown: waiting for %v connections to finish.", activeConnections) | ||
lastReport := time.Time{} | ||
ticker := time.NewTicker(s.shutdownPollPeriod) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
defer ticker.Stop()
b890087
to
731ac3b
Compare
This commit introduces signal handling. Parent teleport process is now capable of forking the child process and passing listeners file descriptors to the child. Parent process then can gracefully shutdown by tracking the amount of current connections and closing listeners once the amount goes to 0. Here are the signals handled: * USR2 signal will cause the parent to fork a child process and pass listener file descriptors to it. Child process will close unused file descriptors and will bind to the used ones. At this moment two processes - the parent and the forked child process will be serving requests. After looking at the traffic and the log files, administrator can either shut down the parent process or the child process if the child process is not functioning as expected. * TERM, INT signals will trigger graceful process shutdown. Auth, node and proxy processes will wait until the amount of active connections goes down to 0 and will exit after that. * KILL, QUIT signals will cause immediate non-graceful shutdown. * HUP signal combines USR2 and TERM signals in a convenient way: parent process will fork a child process and self-initate graceful shutdown. This is a more convenient than USR2/TERM sequence, but less agile and robust as if the connection to the parent process drops, but the new process exits with error, administrators can lock themselves out of the environment. Additionally, boltdb backend has to be phased out, as it does not support read/writes by two concurrent processes. This had required refactoring of the dir backend to use file locking to allow inter-process collaboration on read/write operations.
731ac3b
to
68b65f5
Compare
This commit introduces signal handling.
Parent teleport process is now capable of forking
the child process and passing listeners file descriptors
to the child.
Parent process then can gracefully shutdown
by tracking the amount of current connections and
closing listeners once the amount goes to 0.
Here are the signals handled:
a child process and pass listener file descriptors to it.
Child process will close unused file descriptors
and will bind to the used ones.
At this moment two processes - the parent
and the forked child process will be serving requests.
After looking at the traffic and the log files,
administrator can either shut down the parent process
or the child process if the child process is not functioning
as expected.
TERM, INT signals will trigger graceful process shutdown.
Auth, node and proxy processes will wait until the amount
of active connections goes down to 0 and will exit after that.
KILL, QUIT signals will cause immediate non-graceful
shutdown.
HUP signal combines USR2 and TERM signals in a convenient
way: parent process will fork a child process and
self-initate graceful shutdown. This is a more convenient
than USR2/TERM sequence, but less agile and robust
as if the connection to the parent process drops, but
the new process exits with error, administrators
can lock themselves out of the environment.
Additionally, boltdb backend has to be phased out,
as it does not support read/writes by two concurrent
processes. This had required refactoring of the dir
backend to use file locking to allow inter-process
collaboration on read/write operations.