-
Notifications
You must be signed in to change notification settings - Fork 114
Conversation
Codecov Report
@@ Coverage Diff @@
## master #194 +/- ##
==========================================
+ Coverage 35.37% 37.34% +1.97%
==========================================
Files 11 12 +1
Lines 1631 1652 +21
==========================================
+ Hits 577 617 +40
+ Misses 969 953 -16
+ Partials 85 82 -3
Continue to review full report at Codecov.
|
grpc.go
Outdated
} | ||
|
||
cmd := exec.Command("ps", psArgs...) | ||
output, err := cmd.CombinedOutput() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you tried to run this intensively ? I ask because I am afraid this command might error sometimes, due to the subreaper. Let's say you run your ps
process from the agent, the subreaper might be the one reaping the process when it terminates, meaning the cmd.Process.Wait()
called from cmd.CombinedOutput()
might return an error because it didn't properly got the exit code from the command.
@laijs @bergwolf any input on that ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, nasty... the subreaper is collecting all child exit status so we cannot use cmd.CombinedOutput()
directly. Maybe we can provide a s.subreaper.combindedoutput()
that buffers cmd stdout/stderr in a separate buffer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you're right, fixed
grpc.go
Outdated
psArgs = []string{"-ef"} | ||
} | ||
|
||
cmd := exec.Command("ps", psArgs...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC we need to enter container's pidns to call ps
, which may or may not be the global sandbox pidns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nop, c.container.Processes()
returns the list of process
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ps
command is going to run in agent's pidns, which is not the container's pidns AFAICT.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, and that's ok, or I'm missing something.. ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@devimc If we want to return a list of processes running inside a container, should'nt we be running the ps
command inside container's PID namespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@amshinde nop, c.container.Processes()
already does that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. c.container.Processes()
returns pids in the caller's namespace. So it works if the caller is in the init_ns
, which is true for the agent. Thanks for the explanation @devimc !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@devimc Can you add explain/ add comments why you need to run the ps command in the agent namespace when you already have all the pids from the container?
Cant you use a predefined ps table format?
grpc.go
Outdated
} | ||
|
||
cmd := exec.Command("ps", psArgs...) | ||
output, err := cmd.CombinedOutput() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, nasty... the subreaper is collecting all child exit status so we cannot use cmd.CombinedOutput()
directly. Maybe we can provide a s.subreaper.combindedoutput()
that buffers cmd stdout/stderr in a separate buffer.
protocols/grpc/agent.proto
Outdated
|
||
// ListProcessesResponse represents the list of running processes inside the container | ||
message ListProcessesResponse { | ||
bytes processList = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
process_list
c9fcb94
to
41cb076
Compare
default: | ||
return resp, fmt.Errorf("invalid format option") | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@devimc everything after this looks like it is not reachable. Am I missing something ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you mean if the format is not supported ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean that if you look at this switch
statement, there are return
for each statement, including default
. How do you expect the code further down to be reached ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, table
doesn't have a return
......
case "table":
case "json":
resp.ProcessList, err = json.Marshal(pids)
return resp, err
.......
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yes I was assuming a fallthrough
here...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sboeuf - I think the code is validating table
as a legitimate format type (but with no associated statement). However, the actual handling for table
is below the switch
.
@kata-containers/agent I'll address code coverage in this pr #207 |
can we merge this?, I'll fix the code coverage in #207 |
rebased 😄 |
|
||
// appends pid line | ||
for _, pid := range pids { | ||
if pid == p { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are comparing the pid from the container namespace to the one in the agent pid namespace. Can you explain why you do this? I would imagine not processes from the container pid namespace would not be visible in the agent namespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added more comments in the code, like this
// Get the list of processes that are running inside the containers.
// the PIDs match with the system PIDs, not with container's namespace
pids, err := c.container.Processes()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just fmi then, which type of pid do we return - I'd hope a container pid namespace pid, not a system pid?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fact that you are able to match the pids from the agent pid namespace makes me think that this is returning the system pids.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry I don't understand your concerns , is the agent running inside a namespace? because I tested this patch and works, ps
returns all the processes, even I see the PID 1 (systemd)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that is sort of the crux - when you do a 'ps' of a container, does it show you the systemd process '1' of the (for want of a better phrase..) mini-OS that is in the VM. If I am understanding correctly, a ps
of a container should only show you the processes in that container - in that workload - we should not see anything, such as systemd pid1, from the mini-OS.
Maybe you can post an example output of what happens to for a 'busybox sh' container or similar to clarify what we get? Also, maybe, a ps
from inside the container (docker exec ps
) so we can compare?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with runc
$ docker run --runtime=runc -dti debian bash
d7d8f6ed73faef744b3961efdef780a4b94642b6d396bd73afa5699c32ee7d01
$ sudo docker-runc ps d7d8f6ed73faef744b3961efdef780a4b94642b6d396bd73afa5699c32ee7d01
UID PID PPID C STIME TTY TIME CMD
root 4337 4320 0 08:31 pts/0 00:00:00 bash
with kata-runtime
$ docker run --runtime=kata-runtime -dti debian bash
32b6d8eb33c750c8ebf960464a0a40860e138d24bfd13a45ee45704dbd1462bc
$ sudo kata-runtime ps 32b6d8eb33c750c8ebf960464a0a40860e138d24bfd13a45ee45704dbd1462bc
UID PID PPID C STIME TTY TIME CMD
root 147 115 0 13:44 ? 00:00:00 bash
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
docker exec with runc
$ docker run --runtime=runc -dti busybox sh
5e93234a965d7e64d7a2ab918b9eaace72712137f08b6a43ce61162f45752ad0
$ docker exec 5e93234a965d7e64d7a2ab918b9eaace72712137f08b6a43ce61162f45752ad0 ps
PID USER TIME COMMAND
1 root 0:00 sh
6 root 0:00 ps
docker exec with kata-runtime
$ docker run --runtime=kata-runtime -dti busybox sh
9e1070e509f5574117e86319aff9c1c2d64a00f2d27a40b974c46f59971266c3
$ docker exec 9e1070e509f5574117e86319aff9c1c2d64a00f2d27a40b974c46f59971266c3 ps
PID USER TIME COMMAND
1 root 0:00 sh
5 root 0:00 ps
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ooh. I see. That is interesting - that it seems docker ps
does not use the PIDS from inside the container user namepace, but indeed shows the host space PIDS. Not really what I expected at all (but maybe there are sound reasons for runc
- that maybe you can then manipulate those processes from the guest).
Things are a little different for kata, as even with that 'guest PID', that is really the VM root PID, so there is not really much you can do with that from the guest - but, showing that is probably as near to runc
behaviour as we can get. We could debate if that is meaningful for VM containers though.
@amshinde , you agree here that this is fine then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I was not expecting to see the system processes here. Looks fine to me then.
a2a806f
to
f97757b
Compare
@amshinde changes applied, thanks |
fixed code coverage 😄 |
almost done @kata-containers/agent one more review please 😃 |
@grahamwhaley @egernst @jcvenegas ping ! PTAL |
I see the centos job working correctly, maybe we hit the limit of concurrent jobs and this one was waiting. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A query or two.
// pids already contains the list of processes that are running | ||
// inside a container, now we have to use that list to filter | ||
// ps output and return just container's processes | ||
cmd := exec.Command("ps", psArgs...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is of course assuming that ps
is available in the VM rootfs is it not? I'm not sure we have a 100% guarantee, or a documented 100% requirement, for that do we?
The alternative is to peruse the /proc (and/or the process namespace related) files, which would require no extra helper apps? That may also mean we don't have to try and parse the string output, which might be easier as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with gwhaley. Also, we should really avoid execing if possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see https://github.com/opencontainers/runc/blob/master/ps.go#L63
the idea behind runtime's ps
sub-command is to list the processes using linux ps command, "peruse the /proc" is not an option since the user decides the output format and the information to show. Would be nice to have an implementation of ps in go
to include it here, but as far as I know there is no. If you concern is that we don't include ps
command in the image, don't worries I already filed an issue for that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, so the user can supply arguments to ps
from the docker command line? In which case, yes, it makes more sense to use ps
directly.
I will note, you might find that some implementations of ps
(notably busybox, which then also implies Alpine I believe), may act differently and/or not support all options that you get with the normal 'full ps'. So, you may wish to beware and test that out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, you can specify ps options using docker top
$ docker top --help
Usage: docker top CONTAINER [ps OPTIONS]
sure thing, I'll consider alpine case
|
||
// appends pid line | ||
for _, pid := range pids { | ||
if pid == p { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just fmi then, which type of pid do we return - I'd hope a container pid namespace pid, not a system pid?
ListProcesses returns a list of running processes inside the container, this function should be called by the runtime in the ps command implementation. fixes kata-containers#193 Signed-off-by: Julio Montes <julio.montes@intel.com>
add unit tests for mockreaper run and combinedOutput functions Signed-off-by: Julio Montes <julio.montes@intel.com>
mockContainer is a mock struct to write unit tests Signed-off-by: Julio Montes <julio.montes@intel.com>
mockreaper and mockcontainer are used to implement TestListProcesses. This patch increases a little the code coverage. Signed-off-by: Julio Montes <julio.montes@intel.com>
rebased |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
@grahamwhaley I think you can merge this now (after you've ack'ed it !) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
ListProcesses returns a list of running processes inside
the container, this function should be called by the runtime
in the ps command implementation.
fixes #193
Signed-off-by: Julio Montes julio.montes@intel.com