Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't wait for signalled processes if subreaper is enabled #1677

Closed
sboeuf opened this issue Dec 11, 2017 · 7 comments
Closed

Don't wait for signalled processes if subreaper is enabled #1677

sboeuf opened this issue Dec 11, 2017 · 7 comments

Comments

@sboeuf
Copy link
Contributor

sboeuf commented Dec 11, 2017

In case the consumer of libcontainer sets a subreaper, we should not wait for processes when signalling them (https://github.com/opencontainers/runc/blob/master/libcontainer/init_linux.go#L467-L517). Otherwise, we have a race about the reaping, and in case the reaping is done from this function, our subreaper will not reap the expected process, meaning that we're gonna miss the exit code related to this process.

@sboeuf
Copy link
Contributor Author

sboeuf commented Dec 11, 2017

@mrunalp
Copy link
Contributor

mrunalp commented Dec 13, 2017

Makes sense to me 👍

@sboeuf
Copy link
Contributor Author

sboeuf commented Dec 13, 2017

To be completely clear, there is no problem in this case with runc because runc does not care who reap the processes as long as they are reaped. But if we think about libcontainer as a real library, then there is a problem because I have no way to make sure my reaper will be able to reap those killed processes in my case. And this is a problem because I need the exit code so that I can send it from the agent on guest OS to the shim on host OS.
runc does not need that since the container process returns the exit code to the shim monitoring it.

@crosbymichael
Copy link
Member

FYI

We handle a global subreaper in containerd with the reaper package here:

https://github.com/containerd/containerd/blob/master/reaper/reaper.go

You do have to modify your code with exec.Cmd usage to use the reaper wait functions but it works out in the long run. If this helps your issues.

@sboeuf
Copy link
Contributor Author

sboeuf commented Dec 13, 2017

@crosbymichael thanks for the link. I don't know if this can help since using this reaper code won't prevent this https://github.com/opencontainers/runc/blob/master/libcontainer/init_linux.go#L510 to be called. If something else (than my subreaper) reap the process, I have no way to tell what was the exit code of this process and I cannot send it back to the host as expected.
Am I missing something here ?

sboeuf pushed a commit to sboeuf/runc that referenced this issue Dec 14, 2017
When a subreaper is enabled, it might expect to reap a process and
retrieve its exit code. That's the reason why this patch is giving
the possibility to define the usage of a subreaper as a consumer of
libcontainer. Relying on this information, libcontainer will not
wait for signalled processes in case a subreaper has been set.

Fixes opencontainers#1677

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
@sboeuf
Copy link
Contributor Author

sboeuf commented Dec 14, 2017

@mrunalp @crosbymichael #1678

sboeuf pushed a commit to sboeuf/runc that referenced this issue Dec 14, 2017
When a subreaper is enabled, it might expect to reap a process and
retrieve its exit code. That's the reason why this patch is giving
the possibility to define the usage of a subreaper as a consumer of
libcontainer. Relying on this information, libcontainer will not
wait for signalled processes in case a subreaper has been set.

Fixes opencontainers#1677

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
sboeuf pushed a commit to sboeuf/runc that referenced this issue Dec 14, 2017
When a subreaper is enabled, it might expect to reap a process and
retrieve its exit code. That's the reason why this patch is giving
the possibility to define the usage of a subreaper as a consumer of
libcontainer. Relying on this information, libcontainer will not
wait for signalled processes in case a subreaper has been set.

Fixes opencontainers#1677

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
@laijs
Copy link
Contributor

laijs commented Dec 15, 2017

Makes sense to me too

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants