-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Timeouts on calls to docker api #12310
Conversation
Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually? |
@@ -316,7 +338,10 @@ func (w *watcher) listContainers(options types.ContainerListOptions) ([]*Contain | |||
// If there are no network interfaces, assume that the container is on host network | |||
// Inspect the container directly and use the hostname as the IP address in order |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This assumption is actually incorrect. Docker can have some of known network modes: default (bridge), host and container (usually used in cloud platforms like kubernetes).
It can be determined by checking .HostConfig.NetworkMode property:
$ sudo curl -s --fail -m3 -XGET --unix-socket /var/run/docker.sock http://localhost/containers/json | jq -r .[].HostConfig.NetworkMode
container:b79efb8d2710b3fb2ae6b9fff5e1cf6f8e841301df9a355371c197156d72c3d4
container:a7fb40e3f13684e91f26c1ed26096e0d39df5d7860db31bab5f924cec5838366
default
default
default
host
container:9f187b5c30cb2ceb43907a2513e093f1039be42c230c33d70a601e9c3173dc5d
default
default
if network mode is container:<ID>
this <ID>
should be inspected to fetch from standard config .NetworkSettings.Networks
like above.
But this is subject for separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sounds about right! are you willing to open a new PR? 😇
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will try to take a shot on this next week
libbeat/common/docker/watcher.go
Outdated
case <-tickChan.C: | ||
if time.Since(time.Unix(w.lastValidTimestamp, 0)) > 10*time.Minute { | ||
logp.Info("No events received withing 10 minutes, restarting watch call") | ||
w.lastValidTimestamp = time.Now().Unix() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why updating this here? If watch call has gone mad maybe we missed some previous events?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed code to use separate var for tracking watch activity
libbeat/common/docker/watcher.go
Outdated
return | ||
} | ||
} | ||
|
||
tickChan.Stop() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this won't be called if return is hit before (on watcher stop), you could probably use defer stop after starting it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed
libbeat/common/docker/watcher.go
Outdated
} | ||
} | ||
|
||
func (w *watcher) listContainers(options types.ContainerListOptions) ([]*Container, error) { | ||
containers, err := w.client.ContainerList(w.ctx, options) | ||
logp.Debug("docker", "List containers") | ||
ctx, cancel := context.WithTimeout(w.ctx, 10*time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you please make all new timeout constants? this should make it easier to fine tune
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for taking this! really appreciated 🎉 🎉
Thank you for contributing!! 🎉 jenkins, test this please |
waiting for green |
* Add timeouts on calls to docker api
@exekias how about including this in next 7.x release? |
definitely! this made the cut for 7.3, so it will be part of it |
Adds timeouts on calls to docker api:
Docker has long history of problems with unresponsive API. If unlucky docker watcher can stuck silently forever on calls to docker API. Adding some timeouts should prevent possibility of this happening.