bug: process start time mismatch #1393

Closed

crosbymichael opened this issue Mar 27, 2017 · 14 comments
@crosbymichael
Member

This was reported in a Docker issue, and another user was able to reproduce the error.

moby/moby#29794

I added some code to pinpoint the exact place causing this, and confirmed that on a CentOS system the start time that runc records does not match what subsequent stat calls return.

diff --git a/libcontainer/container_linux.go b/libcontainer/container_linux.go
index 82c6d8e..8bd0edf 100644
--- a/libcontainer/container_linux.go
+++ b/libcontainer/container_linux.go
@@ -1132,7 +1132,7 @@ func (c *linuxContainer) doesInitProcessExist(initPid int) (bool, error) {
 		return false, newSystemErrorWithCausef(err, "getting init process %d start time", initPid)
 	}
 	if c.initProcessStartTime != startTime {
-		return false, nil
+		return false, newSystemError(fmt.Errorf("process start time and recorded start time do not match"))
 	}
 	return true, nil
 }
@@ -1186,11 +1186,15 @@ func (c *linuxContainer) currentState() (*State, error) {
 	var (
 		startTime           string
 		externalDescriptors []string
+		err                 error
 		pid                 = -1
 	)
 	if c.initProcess != nil {
 		pid = c.initProcess.pid()
-		startTime, _ = c.initProcess.startTime()
+		startTime, err = c.initProcess.startTime()
+		if err != nil {
+			return nil, err
+		}
 		externalDescriptors = c.initProcess.externalDescriptors()
 	}
 	state := &State{
diff --git a/libcontainer/system/proc_test.go b/libcontainer/system/proc_test.go
index ba91029..864a458 100644
--- a/libcontainer/system/proc_test.go
+++ b/libcontainer/system/proc_test.go
@@ -7,6 +7,7 @@ func TestParseStartTime(t *testing.T) {
 		"4902 (gunicorn: maste) S 4885 4902 4902 0 -1 4194560 29683 29929 61 83 78 16 96 17 20 0 1 0 9126532 52965376 1903 18446744073709551615 4194304 7461796 140733928751520 140733928698072 139816984959091 0 0 16781312 137447943 1 0 0 17 3 0 0 9 0 0 9559488 10071156 33050624 140733928758775 140733928758945 140733928758945 140733928759264 0": "9126532",
 		"9534 (cat) R 9323 9534 9323 34828 9534 4194304 95 0 0 0 0 0 0 0 20 0 1 0 9214966 7626752 168 18446744073709551615 4194304 4240332 140732237651568 140732237650920 140570710391216 0 0 0 0 0 0 0 17 1 0 0 0 0 0 6340112 6341364 21553152 140732237653865 140732237653885 140732237653885 140732237656047 0":                                      "9214966",
 		"24767 (irq/44-mei_me) S 2 0 0 0 -1 2129984 0 0 0 0 0 0 0 0 -51 0 1 0 8722075 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 0 0 0 17 1 50 1 0 0 0 0 0 0 0 0 0 0 0":                                                                                                                                                                         "8722075",
+		"22997 (rethinkdb) S 22986 22997 22997 0 -1 1077944320 13867 438 8 0 20 7 0 0 20 0 69 0 27005578 100900864 8449 18446744073709551615 1 1 0 0 0 0 0 4096 17474 18446744073709551615 0 0 17 0 0 0 0 0 0 0 0 0 0 0 0 0 0":                                                                                                                           "27005578",
 	}
 	for line, startTime := range data {
 		st, err := parseStartTime(line)

The error we see with my patch applied is: exec failed: process start time and recorded start time do not match

How is this possible? Why CentOS? Why!

uname -a
Linux ip-10-0-0-86 3.10.0-327.10.1.el7.x86_64 #1 SMP Tue Feb 16 17:03:50 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

I've been trying to debug this but have no clue what is going on. Does anyone else have any ideas how this mismatch is possible? It looks like the same type of error that was fixed in #1136, but I have confirmed we are running a build that contains that commit.
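For context, the value being compared here is field 22 (starttime, in clock ticks since boot) of /proc/<pid>/stat, which is what parseStartTime extracts; the stat lines in the test above are real examples. A minimal sketch of that parsing, illustrative rather than runc's exact code (the subtlety is that field 2, the comm, can itself contain spaces, as in (gunicorn: maste) above):

package system

import (
	"fmt"
	"strings"
)

// parseStartTime extracts field 22 (starttime) from a /proc/<pid>/stat
// line. Field 2 is the command name wrapped in parentheses and may
// contain spaces, so split after the last ')' instead of on raw whitespace.
func parseStartTime(stat string) (string, error) {
	end := strings.LastIndex(stat, ")")
	if end == -1 {
		return "", fmt.Errorf("invalid stat data: %q", stat)
	}
	// Everything after the ')' starts at field 3, so starttime
	// (field 22) is index 19 of this slice.
	fields := strings.Fields(stat[end+1:])
	if len(fields) < 20 {
		return "", fmt.Errorf("invalid stat data: %q", stat)
	}
	return fields[19], nil
}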

@crosbymichael
Member Author

SELinux is set to Enforcing, but I'm not seeing any EPERM denials or anything related in the audit logs.

@rhatdan
Contributor

rhatdan commented Mar 27, 2017

I doubt it is an SELinux issue.

@crosbymichael
Member Author

@rhatdan Yeah, me too. I was just adding it as a data point.

@cyphar
Member

cyphar commented Mar 27, 2017

It looks like the CentOS kernel is missing this kernel patch: torvalds/linux@266b7a0. Basically, when a multithreaded process does an execve(2), the thread that ends up performing the exec is effectively arbitrary, so tsk->start_time and tsk->real_start_time need to be fixed up to match the old thread-group leader. /proc/<pid>/stat shows us real_start_time, but it appears that real_start_time was not updated until the linked patch in 2013 (it had been broken since 2007).
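A rough way to see this on an affected kernel is a self-exec'ing Go program. This is only illustrative: a Go program is always multithreaded, so syscall.Exec here is a multithreaded execve(2), but whether the exec happens on a non-leader thread is up to the Go scheduler, so it is not a reliable reproducer:

package main

import (
	"fmt"
	"os"
	"strings"
	"syscall"
)

// startTimeSelf reads field 22 (starttime) of /proc/self/stat.
// No bounds checking here; this is only a sketch.
func startTimeSelf() string {
	data, err := os.ReadFile("/proc/self/stat")
	if err != nil {
		panic(err)
	}
	s := string(data)
	return strings.Fields(s[strings.LastIndex(s, ")")+1:])[19]
}

func main() {
	if len(os.Args) > 1 {
		// Post-exec: same PID, so on a correct kernel this matches
		// the value printed before the exec.
		fmt.Println("after exec: ", startTimeSelf())
		return
	}
	fmt.Println("before exec:", startTimeSelf())
	// Go programs are multithreaded, so this is a multithreaded
	// execve(2); on an affected kernel starttime can change across it.
	if err := syscall.Exec("/proc/self/exe", []string{os.Args[0], "again"}, os.Environ()); err != nil {
		panic(err)
	}
}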

@crosbymichael
Member Author

I'll submit my changes to make the error messages returned from runc more explicit.

@cyphar
Member

cyphar commented Mar 28, 2017

My worry with this patch is that it converts our silent PID-reuse detection into an error condition, which means that if you have a lot of process churn, with containers being started and stopped, you'll start getting errors in cases runC could otherwise handle.

Maybe we should just log to stderr rather than actually bail in this case?

@crosbymichael
Member Author

@cyphar It's always been an error condition; that's why it was reported.

How can we take care of it? The only alternative is to remove the check and just perform the actions.

@cyphar
Member

cyphar commented Mar 29, 2017

@crosbymichael But it's not an error condition (unless things are broken, as we've seen in this issue). We check the start time to guard against PID recycling attacks: if a container's init dies and a new process is later started with the same PID, the next runC command should still update the state correctly, which requires making sure the start time is the same. But if we turn a differing start time into an error in doesInitProcessExist, that won't work anymore AFAICS.

To be honest, I'm not really sure how we can deal with kernels where the start time is wrong, given that it's a kernel bug that was fixed back in 2013. Maybe we could come up with another way of disambiguating processes, but the kernel actually guarantees that start_time stays the same no matter how many times the process execs.
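Put differently, the check answers "is the process at this PID still our init?", and a mismatch should mean "no", not "fail". A minimal sketch of that logic, assuming a helper like system.GetProcessStartTime from libcontainer/system (the names are otherwise illustrative, not runc's actual API):

package main

import "github.com/opencontainers/runc/libcontainer/system"

// initProcessAlive sketches the disambiguation described above: a PID
// alone does not identify a process, but (PID, starttime) does on a
// correct kernel.
func initProcessAlive(pid int, recordedStartTime string) (bool, error) {
	currentStartTime, err := system.GetProcessStartTime(pid)
	if err != nil {
		// Simplification: treat any failure to read /proc/<pid>/stat as
		// "the init process no longer exists", so the container can be
		// marked stopped.
		return false, nil
	}
	// Same PID but a different start time means the PID was recycled by
	// an unrelated process. Report "not alive" rather than erroring, so
	// the state update (container -> stopped) keeps working.
	return currentStartTime == recordedStartTime, nil
}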

@dqminh
Contributor

dqminh commented Mar 29, 2017

I agree with @cyphar that this is not an error condition; when we detect that the PID has been recycled, we just mark the container state as stopped and continue.

The error here comes from the fact that the kernel is buggy. Either detecting that the system is running a buggy kernel so we can bail early, or finding another way to disambiguate processes, would be nice, but so far I don't have a good idea of how to achieve either in runc.
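For the first option, a version gate could look like the sketch below, though distro backports make any pure version comparison a heuristic at best; the known-bad prefix used here is just the kernel from this report, not a vetted cutoff:

package main

import (
	"fmt"
	"strings"
	"syscall"
)

// kernelRelease returns the running kernel's release string,
// e.g. "3.10.0-327.10.1.el7.x86_64" from the report above.
func kernelRelease() (string, error) {
	var uts syscall.Utsname
	if err := syscall.Uname(&uts); err != nil {
		return "", err
	}
	buf := make([]byte, 0, len(uts.Release))
	for _, c := range uts.Release {
		if c == 0 {
			break
		}
		buf = append(buf, byte(c))
	}
	return string(buf), nil
}

func main() {
	rel, err := kernelRelease()
	if err != nil {
		panic(err)
	}
	// Placeholder heuristic: flag the known-bad kernel from this report.
	// Real detection would need a per-distro table of fixed versions,
	// since the upstream fix was backported at different points.
	if strings.HasPrefix(rel, "3.10.0-327") {
		fmt.Println("warning: kernel", rel, "may report wrong start times after execve")
	}
}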

@crosbymichael
Member Author

@dqminh The problem is that once we mark the container as stopped, certain operations are no longer allowed, and that is what fails: you cannot exec into a stopped container, etc.

@crosbymichael
Member Author

@cyphar @dqminh So do you think we should just remove the check, since it blocks some operations, or should we keep it as-is and suggest that people upgrade their kernels?

Is there a better way to check that the pid is our pid, or is this not really an issue?

@cyphar
Member

cyphar commented Apr 16, 2017

So do you think we should just remove the check, since it blocks some operations, or should we keep it as-is and suggest that people upgrade their kernels?

People should upgrade their kernels IMO -- this is a buggy kernel issue that was fixed several years ago.

Is there a better way to check that the pid is our pid, or is this not really an issue?

My proposal in #1224 is another possible way of doing it, but it has the downside that you have to futz with the host's mount table. And it will break rootless. IMO the "right way" is the way we're doing it now -- unfortunately there was a kernel bug with this technique in the past, but it was fixed a while ago.

@evantorrie

What version of the CentOS kernel were you testing?

It appears that the kernel fix referred to in #1393 (comment) is in the 3.10.0-514.16.1 kernel released as part of CentOS 7.3.1611.

@cyphar
Member

cyphar commented May 25, 2017

@evantorrie At the time I wrote that comment I was checking the latest sources from CentOS. I talked to some Red Hat folks and they patched their kernel, so it should work now (AFAIK). Closing.
