Add initial lifecycle and hooks into kubelet. #1028
Conversation
// Lifecycle is an interface implemented by things that handle lifecycle events
type Lifecycle interface {
    // OnStart is called when a container is started. The management system blocks until it returns
Before or after container start?
after. (updated)
comments (mostly) addressed, ptal. thanks!
type Lifecycle interface {
    // OnStart is called after a container is started. The management system blocks until it returns
    OnStart(containerID string) (LifecycleOutput, error)
    // OnStop is called before a container is stopped. The management system blocks until it returns
I'd suggest the names PostStart and PreStop. Also, I'm concerned about blocking indefinitely. Should it have a timeout?
Names changed. I would like to leave them as blocking indefinitely. I think that these need to be true lifecycle events, not notifications. The call needs to return before management can proceed.
(I agree that practically we probably need a timeout, or other guarantees to "kill things that are not responding" but I'd like to start out of the gate with the right model, and then add the safety valve later)
I don't want development of lifecycle to proceed assuming that communication is one way from kubelet to container.
If you feel strongly about the safety side of things, I'll spend the time to expand this into something with timeouts, but I'd prefer to do that in a different PR.
(Seems to be missing a push.) I'm not sure I buy that indefinite wait is the right model. @thockin may have thoughts on this?
My thoughts about the wait are these:
Management should wait until some timeout. Once the timeout is reached, the management system should unequivocally kill the container as "not responding", the same way OS X or Android do.
I can implement that behavior in this PR, but I'd rather do it in a follow-up.
My suggestion is to add a timeout to the function calls, with the expectation that after that time the functions will return with an error. I'm being picky about this because you really need to pass the timeout down into the script you're calling as a parameter. I don't care if you implement it in another PR, but I think the timeout really deserves to make it into the interface for the first cut, because it's very important.
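A minimal sketch of what that could look like, assuming hypothetical signatures (nothing below is in the PR as written): the timeout is both enforced by the kubelet and handed to the hook, so the script knows its budget.

package kubelet

import (
    "fmt"
    "time"
)

// LifecycleOutput is the output from a lifecycle event (as in the PR).
type LifecycleOutput struct {
    Details string
}

// Hypothetical variant of the interface with the timeout made explicit,
// so implementations can pass it down to the script as a parameter.
type Lifecycle interface {
    // PostStart is called after a container starts; the kubelet waits at
    // most timeout before treating the container as not responding.
    PostStart(containerID string, timeout time.Duration) (LifecycleOutput, error)
    // PreStop is called before a container stops, with the same contract.
    PreStop(containerID string, timeout time.Duration) (LifecycleOutput, error)
}

// runWithTimeout is a sketch of caller-side enforcement: run the hook in a
// goroutine and give up once the budget is spent, after which the caller
// would kill the container as "not responding".
func runWithTimeout(hook func() (LifecycleOutput, error), timeout time.Duration) (LifecycleOutput, error) {
    type result struct {
        out LifecycleOutput
        err error
    }
    ch := make(chan result, 1)
    go func() {
        out, err := hook()
        ch <- result{out, err}
    }()
    select {
    case r := <-ch:
        return r.out, r.err
    case <-time.After(timeout):
        return LifecycleOutput{}, fmt.Errorf("lifecycle hook timed out after %v", timeout)
    }
}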
People internally use shutdown notification to do important cleanup, I think fairly commonly. We have a configurable time limit. I'm not saying you should make it configurable with this change, but I think it's very important that the scripts know what their time limit is.
I guess my point is that I think that model is broken. The management system shouldn't give you a quota of time and hope you can deal with it. It should stick around for as long as it possibly can, and then treat your container as broken (similar to failing health checks) if you fail to respond within the maximum time it can allot to lifecycle events. It seems subtle, but I think it's a really important difference.
In one model, the container and management system are partners; in the other, the management system is in charge and simply notifies containers "this is the way it is".
I too am being detailed here, because I think it's really important. The "notification" style approach leads to a style of lifecycle that is much more advisory than contract-driven, and that's precisely what I'm trying to avoid.
--brendan
I think the management system has to give you a quota of time (and if you can't deal with it, you get killed unceremoniously). Otherwise rogue tasks can eat up all our resources. And if there is a quota, it's imperative that we tell the user what it is, so they know what they reasonably have a chance of running. Otherwise you can get nondeterministic behavior with folks' scripts sometimes completing and sometimes not, with no obvious fix.
There's no question about who is in the driver's seat here. Timeout makes it very clear to the task what is about to happen to it, to let it best use its remaining cycles.
I agree with a lot of what @lavalamp says; however, I think there is certainly room to do something different from what we do in the internal system that @brendandburns dislikes.
We can certainly make a system that lets some pods take a whole day, or more, to respond to a lifecycle event. But if we have to assume that all pods might take a day, or whatever time limit you choose, it severely limits our ability to deliver both high utilization and rapid pod startup.
One option is to make the pod creator voluntarily set a time limit on the command, visible in the pod API object; then, later, we can expose the costs of picking a long limit to users.
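As a sketch of that option (the type and field names here are hypothetical, not part of this PR), the container spec could carry a user-chosen limit that the kubelet reads when running the hook:

// Hypothetical API addition: the pod author declares up front how long
// their lifecycle handling may take, so the system can account for it.
type Container struct {
    // ... existing fields elided ...

    // LifecycleGracePeriodSeconds is how long the kubelet will wait for a
    // lifecycle hook before killing the container. Zero means "use the
    // system default".
    LifecycleGracePeriodSeconds int64 `json:"lifecycleGracePeriodSeconds,omitempty" yaml:"lifecycleGracePeriodSeconds,omitempty"`
}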
There will need to be a timeout. As Brendan pointed out, even systems like Android have timeouts.
In practice, different applications need different timeout values. Data processing workers with fine-grained shards may be able to complete their work and terminate in seconds. A stateless, load-balanced server may need to wait two readiness-probe cycles to ensure the load balancer has redirected traffic. A database with an asynchronously updated backup may need a couple of minutes to fully sync its state. A sharded service may need tens of minutes to spin up a replacement. A storage server may need hours to offload its data.
One could allow the timeout to be directly configurable. However, this could create priority inversion in a system that supports multiple levels of quality of service. Observe in the examples above that the amount of time required to drain correlates strongly with possible quality-of-service levels. I recommend setting the timeout according to availability QoS level.
HOWEVER, users want different timeout values in different situations. For instance, they won't want to wait an hour if they just want to take down their service. We'll need all user-initiated operations that effectively stop containers to accept timeout values, which should override the default (assuming they're smaller than the default).
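A tiny sketch of that reconciliation (names hypothetical, not from the PR): the default comes from the pod's declared limit or QoS level, and a user-supplied value can only shorten it, never extend it.

package kubelet

import "time"

// effectiveTimeout picks the grace period actually enforced for a stop.
// A userOverride of zero or less means the user supplied no override; an
// override can shrink the default but never extend it.
func effectiveTimeout(podDefault, userOverride time.Duration) time.Duration {
    if userOverride > 0 && userOverride < podDefault {
        return userOverride
    }
    return podDefault
}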
comments (mostly) addressed. ptal thanks
now actually pushed the commit... sigh
@@ -320,12 +324,24 @@ func (kl *Kubelet) runContainer(pod *Pod, container *api.Container, podVolumes v
        Binds: binds,
        NetworkMode: netMode,
    })
    if kl.lifecycle != nil {
        if output, err := kl.lifecycle.PostStart(dockerContainer.ID); err != nil {
This is a different "err" than the one used to return the result of StartContainer? If so, that's confusing.
OTOH, it seems like we should convey failure of a hook to the user somehow. What use cases do we imagine for PostStart hooks? Registration with external discovery mechanisms (e.g., worker registration)? If this were a PreStart hook, which might be used to initialize data in the container, I'd abort starting the container on hook failure, or at least provide the option to do that. (Which I suppose implies we'd need the user to request the hook in the API.)
Have you thought about a mechanism to return the output to the user? Probably we need an event stream they can listen on. In this case, it would be nice to be able to get structured data back, so we could grow this into a real API. Could we require the output to be YAML?
Similarly, I'd like to pass structured information to the hook, so that we can grow it into a real API. In other threads, we've discussed environment variables, but YAML seems better.
Yes, err is shadowed. I assumed that was intentional?
There's chatter about making our api not take YAML; I'd prefer to be consistent and use JSON for structured info.
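To make the shadowing concrete, here is a sketch of the hunk above with the hook's error given its own name, and the output decoded as JSON per the preference just stated. The use of glog and the JSON requirement are assumptions for illustration, not part of the PR.

// Sketch only; assumes: import ( "encoding/json"; "github.com/golang/glog" )
if kl.lifecycle != nil {
    // hookErr is deliberately distinct from the err returned by StartContainer.
    output, hookErr := kl.lifecycle.PostStart(dockerContainer.ID)
    if hookErr != nil {
        glog.Errorf("PostStart hook failed for %s: %v", dockerContainer.ID, hookErr)
    } else {
        // If hook output were required to be JSON, it could be decoded into
        // structured data and grown into a real API later.
        var details map[string]interface{}
        if jsonErr := json.Unmarshal([]byte(output.Details), &details); jsonErr == nil {
            glog.Infof("PostStart hook output for %s: %v", dockerContainer.ID, details)
        }
    }
}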
I was going to suggest we have a unifying proposal for info that travels from the kubelet to the cluster, which includes start/stop events as well as results of lifecycle hooks, once Dawn's restart work was up.
@smarterclayton I think "kubelet -> pod" info path is a totally different story from the "kubelet -> rest of cell" path.
@lavalamp I was referring to hook output back to the user, I was assuming that as a client if I want to see hook results I'd need to be able to identify the pod that specified it, but that I probably want hook info to be historically available (it's more like log results than snapshot info). So I was assuming that this lived somewhere between the "this died and got restarted" restart info of Dawn's item and the "how do I store, ship, aggregate, and search/view logs for a particular pod" work. Didn't seem like it fits on either extreme.
@smarterclayton yeah, kubelet needs to log its actions per pod and be able to regurgitate that on request. I think that's larger than this PR though.
For sure - I think that's #285 at least in part
I'm going to try not to get stuck in the code, and instead talk about the design. I think this is one of those things that would really benefit from a design doc that gets committed, and that the discussion would be better on a doc than on the code. I'm 100% in favor of the idea, but this feels inside-out to me. Maybe my mental model is over-complex, but I'm going to write it down anyway. I expected to see something like the API expanding:
Specifically, that the user specifies the events they want delivered and how to deliver them (exec, signal, HTTP GET/POST, etc.). Hardcoding the exec paths makes this very specific to k8s; I think we want to push this as a semi-standard way of doing lifecycle, even on existing containers.
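A rough sketch of the API shape Tim describes (all type and field names hypothetical, for discussion only): the user names the event and picks the delivery mechanism in the API object.

// A Handler describes how to deliver a lifecycle event; exactly one of
// its fields should be set.
type ExecAction struct {
    // Command is run inside the container.
    Command []string `json:"command,omitempty" yaml:"command,omitempty"`
}

type HTTPGetAction struct {
    Path string `json:"path,omitempty" yaml:"path,omitempty"`
    Port int    `json:"port,omitempty" yaml:"port,omitempty"`
}

type Handler struct {
    Exec    *ExecAction    `json:"exec,omitempty" yaml:"exec,omitempty"`
    HTTPGet *HTTPGetAction `json:"httpGet,omitempty" yaml:"httpGet,omitempty"`
}

// Lifecycle on the container spec names which events the user wants
// delivered; absent fields mean no hook.
type Lifecycle struct {
    PostStart *Handler `json:"postStart,omitempty" yaml:"postStart,omitempty"`
    PreStop   *Handler `json:"preStop,omitempty" yaml:"preStop,omitempty"`
}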
+1 to sticking this in the api.
Ref. #140, just to connect the PR to the issue.
Ok, I'm trying this the other way. I opened a PR for the API types: #1041. I'll close this one for now, and we can resume discussion there.