Init KEP to migrate in-tree dockershim to out-of-tree #866

Closed
158 changes: 158 additions & 0 deletions keps/sig-node/20190226-migrate-in-tree-dockershim-to-out-of-tree.md
@@ -0,0 +1,158 @@
---
title: Separating a CRI for docker from Kubelet
authors:
- "@resouer"
- "@dims"
- "@zhangxiaoyu-zidif"
owning-sig: sig-node
reviewers:
- "@yujuhong"
- "@dchen1107"
- "@derekwaynecarr"
- "@PatrickLang"
approvers:
- "@DawnChen"
Member

s/@DawnChen/@dchen1107?

- "@yujuhong"
creation-date: 2019-02-26
last-updated: 2019-02-26
status: provisional
---

# Separating a CRI for Docker from Kubelet

## Table of Contents

- [Terms](#terms)
- [Summary](#summary)
- [Motivation](#motivation)
  * [Pros](#pros)
  * [Cons](#cons)
  * [Goals](#goals)
  * [Non-Goals](#non-goals)
- [Proposal](#proposal)
  * [Dockershim deprecation criteria](#dockershim-deprecation-criteria)
  * [Dockershim deprecation plan](#dockershim-deprecation-plan)
  * [Test Plan](#test-plan)
  * [Graduation Criteria](#graduation-criteria)
- [Implementation History](#implementation-history)

## Terms

- **CRI:** Container Runtime Interface – a plugin interface which enables kubelet to use a wide variety of container runtimes, without the need to recompile.
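
To make the "plugin interface" idea concrete, here is a minimal Go sketch. It is not the real CRI (which is a much larger gRPC API); `RuntimeService`, `fakeDocker`, `fakeContainerd`, and `startPod` are hypothetical names invented for illustration. The point is only that the kubelet-side code depends on an interface, so runtimes can be swapped without recompiling the caller's logic.

```go
package main

import "fmt"

// RuntimeService is a deliberately tiny, hypothetical stand-in for the CRI
// runtime API. The real CRI defines many more calls and is served over gRPC.
type RuntimeService interface {
	Name() string
	RunPodSandbox(podName string) (string, error)
}

// fakeDocker and fakeContainerd are illustrative implementations only.
type fakeDocker struct{}

func (fakeDocker) Name() string { return "docker" }
func (fakeDocker) RunPodSandbox(podName string) (string, error) {
	return "docker-sandbox-" + podName, nil
}

type fakeContainerd struct{}

func (fakeContainerd) Name() string { return "containerd" }
func (fakeContainerd) RunPodSandbox(podName string) (string, error) {
	return "containerd-sandbox-" + podName, nil
}

// startPod plays the role of the kubelet: it only knows the interface,
// never a concrete runtime.
func startPod(rt RuntimeService, podName string) (string, error) {
	return rt.RunPodSandbox(podName)
}

func main() {
	for _, rt := range []RuntimeService{fakeDocker{}, fakeContainerd{}} {
		id, err := startPod(rt, "nginx")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %s\n", rt.Name(), id)
	}
}
```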


## Summary

The CRI implementation for Docker (i.e. dockershim) is currently built into the kubelet code base. This proposal lays out a concrete deprecation and migration plan for moving dockershim out of tree without breaking current production users or WIP engineering efforts.

## Motivation

In Kubernetes, CRI is used as the "default" container runtime interface, while the CRI implementation for Docker (a.k.a. dockershim) is currently part of the kubelet code and coupled to kubelet's lifecycle.

This is not ideal: kubelet depends on a specific container runtime, which creates a maintenance burden not only for developers in sig-node but also for cluster administrators when critical issues (e.g. a runc CVE) hit container runtimes. The pros of moving dockershim out of tree are straightforward:

### Pros
- Docker is not special; it is just a CRI implementation like every other one in our ecosystem.
- Currently, dockershim "enjoys" some backdoors for various reasons. Deprecating these "features" should reduce the maintenance burden on kubelet.
- A cri-dockerd can be maintained independently.
- Over time we can remove vendored docker dependencies in kubelet.

Having said that, the cons of deprecating the built-in dockershim require a lot of attention:

### Cons
- Deployment pain with a new binary in addition to kubelet.
- An additional component adds complexity for now. This may be relieved as Docker versions evolve.
- The number of affected users may be large.
Contributor

I'd be more specific about how this affects the users.

- Users must change how they currently use Kubernetes with Docker.
- Users have to change their existing workflows to adapt to these changes.
- Other impacts not yet recorded.
- Updating all the ecosystem tools to support the new cri-dockerd.
- Many people use the built-in dockershim for in-cluster image builds. While that may not be something we recommend for a variety of reasons, it will be a breaking change for these users.
- CRI is still in alpha; we should probably get a 1.0 out before splitting dockershim out of kubelet completely.
Member

I strongly support progressing CRI to v1, for the sake of all the runtime implementations. I'm less clear on the benefit of extracting dockershim to a standalone CRI. Did you consider deprecating dockershim in favor of existing CRI implementations instead?

Author

I will try to confirm if sig-node or Docker Inc has a plan to support a CRI-docker for long time (which is what I heard during the sig meeting). If not, deprecating dockershim directly will be the goal.

Member

We talked about this at SIG Node, including the folks from Docker Inc. Docker folks plan to maintain cri-dockershim.

- Existing CNI and CSI plugins may also be affected.
Contributor

Could you be more specific about how these plugins are going to be affected?

For one, every container runtime generates CNI calls slightly differently. It might be nice to standardize (even if not officially) on one of the k8s cni wrappers, be it oci-cni or the existing dockershim networking plugin.

Also, is this the time to finally decommission kubenet?

- The current dockershim has an independent module that interacts with CNI plugins. Migrating dockershim out of kubelet may affect some of the interactions between dockershim and CNI plugins.
- cri-dockerd will vendor kubernetes/kubernetes, which may be tough.
Contributor

For cri-dockerd, I think you should also include the memory overhead for separating dockershim out from kubelet.

Yes, independent software needs extra resource allocation.

- As independent software running on the node, cri-dockerd must be allocated enough resources to guarantee its availability.

Member

cri-dockerd will still be vendoring kubernetes/kubernetes for a while during the transition. Vendoring kubernetes/kubernetes is not easy.

Author

ACK, it's a good point

> You can check [the discussion in sig-node mailing list](https://groups.google.com/forum/#!msg/kubernetes-sig-node/0qVzfugYhro/l6Au216XAgAJ) for more details.

Based on all the discussion, we agree that we should not rush to an immediate decision. At the same time, it's the right time to start designing and documenting the dockershim deprecation criteria and plan, which make up the rest of this KEP.

### Goals

- Concrete dockershim deprecation criteria.
- A brief plan to deprecate dockershim spanning multiple releases.

Member

@dims dims Mar 6, 2019

Can we add another goal of writing up a criteria for adopting one of the existing CRI runtimes (for example containerd) that can eventually replace cri-dockerd?

Considering how many Docker users there are, it would be better not to drop cri-dockerd here.

Member

SIG Node doesn't want to make the call on which CRI implementation a vendor or user should use. Instead, we design the API that defines how the runtime engages with kubelet and nodes, a test suite to verify and qualify the functionality, and a portable toolset (cri-tools) for debugging and introspection.

### Non-Goals

- Deprecation of dockershim immediately without consideration for users and WIP efforts depending on it.
- Refactoring or re-design of dockershim itself due to deprecation.

## Proposal

### Dockershim deprecation criteria

- CRI itself is beta.
- kubelet has no dependency on dockershim/docker in its whole lifecycle.
- All node related features are CRI generic and have no "back door" dependency on dockershim/docker.
Contributor

Deprecate and remove, or replace all Docker-specific features.

- Deprecate and remove, or replace all Docker-specific features.
- Benchmarks show an acceptable level of performance degradation after moving dockershim out of tree.
What percentage of degradation is acceptable (10%)?

Member

10% might not be acceptable by the wider audience.
but my guess is that it's going to end up being less than that 🤔

- An out-of-tree CRI for Docker is implemented, well maintained, and has reached beta.
- The E2E test framework has been updated to fully support out-of-tree CRI container runtimes.
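
The benchmark criterion above can be checked with simple arithmetic. The following Go sketch assumes latency-style measurements (lower is better) and a 10% budget; the budget is taken from the reviewer's question, not from any agreed threshold, and `degradationPercent` and `withinBudget` are hypothetical helper names.

```go
package main

import "fmt"

// degradationPercent reports how much slower the candidate (out-of-tree) run
// is than the baseline (in-tree) run, as a percentage of the baseline.
// A negative result means the candidate is faster.
func degradationPercent(baselineMs, candidateMs float64) float64 {
	return (candidateMs - baselineMs) * 100 / baselineMs
}

// withinBudget reports whether the degradation stays at or below the budget.
// The budget value itself is an assumption to be settled by the SIG.
func withinBudget(baselineMs, candidateMs, budgetPercent float64) bool {
	return degradationPercent(baselineMs, candidateMs) <= budgetPercent
}

func main() {
	// Made-up numbers for illustration, not real benchmark results.
	baseline, candidate := 200.0, 212.0
	fmt.Printf("degradation: %.1f%%\n", degradationPercent(baseline, candidate))
	fmt.Println("within 10% budget:", withinBudget(baseline, candidate, 10))
}
```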

### Dockershim deprecation plan

Step 1: Stabilize in-tree dockershim and decouple dockershim from kubelet (but still in-tree).

Target releases: 1.15, 1.16, 1.17

target release is v1.20 now. 😄


Actions:

- Mark in-tree dockershim as "maintenance mode":
  - CRI generic changes/features can continue on dockershim.
  - WIP efforts on dockershim can continue to completion.
  - dockershim/docker specific changes/features should be rejected.
Member

we will continue updating the docker/docker dependency. right?

Perhaps continuing to update the docker/docker dependency is a must.

Contributor

Yes, we still need to update Docker when needed.

- Deprecate the legacy features of dockershim in kubelet by providing a specific timeline. Currently, kubelet still has:
  - vendored dockershim.
  - flags that are used to configure dockershim.
  - support for getting container logs when Docker uses journald as the logging driver.
  - logic for moving Docker processes to a given cgroup.
  - TBD: anything else?
- Separate the in-tree dockershim package from kubelet and provide an "option" to enable/disable it. The original in-tree dockershim stays in place for now and is deprecated gradually.

- Ensure e2e/Node e2e test framework is CRI generic and test cases are independent of container runtime.
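
The "option" to enable or disable the in-tree dockershim mentioned in the actions above could look roughly like the Go sketch below. Both flag names (`--enable-in-tree-dockershim`, `--remote-endpoint`) and the `selectRuntime` helper are hypothetical and exist only for this illustration; they are not actual kubelet flags, and the KEP does not specify how the option would be exposed.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// selectRuntime decides which runtime path a kubelet-like process would take.
// All flag names here are hypothetical, invented for this sketch.
func selectRuntime(args []string) (string, error) {
	fs := flag.NewFlagSet("kubelet-sketch", flag.ContinueOnError)
	inTree := fs.Bool("enable-in-tree-dockershim", true,
		"hypothetical: keep using the built-in dockershim")
	endpoint := fs.String("remote-endpoint", "",
		"hypothetical: socket of an out-of-tree CRI runtime")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	if *inTree {
		// Default stays on the in-tree path so existing users are not broken.
		return "in-tree dockershim", nil
	}
	if *endpoint == "" {
		return "", fmt.Errorf("in-tree dockershim disabled but no remote endpoint given")
	}
	return "remote CRI at " + *endpoint, nil
}

func main() {
	choice, err := selectRuntime(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	fmt.Println("using", choice)
}
```

Keeping the in-tree path as the default matches the non-goal of not breaking users immediately; flipping the default would be a later step in the plan.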

Step 2: Work out an out-of-tree CRI for docker
Member

does this not already exist in the form of containerd?

Author

@resouer resouer Apr 1, 2019

I assume they are different. During a recent sig-node meeting, folks from Docker Inc said they are willing to maintain a CRI-docker.

Member

All recent versions of docker are built on top of containerd, so I'm unclear on what functionality dockershim has that cri-containerd does not.

@Random-Liu do you know if there is any reason to use dockershim over cri-containerd with recent versions of docker?

Member

@Random-Liu Random-Liu Apr 20, 2019

I think in the long run, we do expect people to move to CRI conformant container runtimes like containerd and cri-o to be able to leverage new Kubernetes features.

However, people may still need more time to make that move. One of the biggest reasons is that with dockershim, you can actually see all your Kubernetes containers in Docker, and people's tools may have been relying on that. For example, people may have built security tools, monitoring tools, and logging tools around that.

With containerd or cri-o, you won't see Kubernetes containers in Docker any more. It takes time for people to migrate their tooling; that's why we need to keep dockershim for a while.

Member

Another reason is that today's Kubernetes Windows support is still based on dockershim. cri-containerd on Windows is WIP, but not finished yet.

So we could keep both mechanisms for now and deprecate the old way gradually.

Contributor

Yes, there are still features that are supported by docker/dockershim, but not with containerd.


Target releases: 1.18

Actions:

- Design and implement an out-of-tree CRI for Docker; it can be "copied" from dockershim as a starting point.
Contributor

Why do we want to start a new repo before eliminating the in-tree dockershim completely? What benefits does this bring when we have to maintain two copies for multiple releases?

Author

I wish we could stop shipping new features (except critical bug fixes) to the built-in dockershim as early as possible. We could mark it as "deprecated" and lock it to a specific version; in that case, new CRI functions should return "Not implemented".

Otherwise, I'm afraid the deprecating process would be blocked by upcoming features in dockershim.

- Re-direct dockershim related features/changes to this out-of-tree CRI for docker.
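
The author's suggestion in the review thread above (lock the in-tree shim and have new CRI functions return "Not implemented") can be sketched in Go as follows. This is an illustration using only the standard library: `frozenShim`, `SomeNewCall`, and `isNotImplemented` are hypothetical names, and a real gRPC-based implementation would more likely return the gRPC `Unimplemented` status code than a plain error value.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotImplemented marks calls added to the CRI after the in-tree shim
// was frozen. The name is an assumption made for this sketch.
var ErrNotImplemented = errors.New("not implemented by frozen in-tree dockershim")

// frozenShim stands in for the locked in-tree dockershim.
type frozenShim struct{}

// ListContainers represents a call that existed before the freeze and is
// still served. The returned IDs are placeholder data.
func (frozenShim) ListContainers() ([]string, error) {
	return []string{"container-1"}, nil
}

// SomeNewCall represents any CRI call introduced after the freeze.
func (frozenShim) SomeNewCall(id string) error {
	return fmt.Errorf("SomeNewCall(%q): %w", id, ErrNotImplemented)
}

// isNotImplemented lets callers detect the frozen-shim case and fall back.
func isNotImplemented(err error) bool {
	return errors.Is(err, ErrNotImplemented)
}

func main() {
	shim := frozenShim{}
	ids, _ := shim.ListContainers()
	fmt.Println("existing call still works:", ids)

	if err := shim.SomeNewCall("container-1"); isNotImplemented(err) {
		fmt.Println("new call rejected:", err)
	}
}
```

Wrapping the sentinel with `%w` keeps the call name in the message while still letting callers test for the condition with `errors.Is`.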


Step 3: Completely deprecate in-tree dockershim from kubelet.

Target releases: TBD. We probably need to keep the in-tree dockershim for 3 more releases as a grace period.

Actions:

- Refactor the e2e/Node e2e test framework to include installation of the CRI for Docker (or use another CRI container runtime).
Member

this would be extremely helpful for existing CRI impls to test/verify compatibility

Contributor

Cluster/node e2e doesn't care what CRI implementation you use under the hood. I think maybe I've missed the point of this statement.

Author

There're two points in my mind:

  1. Ensure cluster/node e2e are 100% CRI focused. For instance: DockerValidator should use the CRI, not the Docker API kubernetes#55414
  2. Ensure test-infra install CRI-docker or cri-containerd binary in e2e machines. Currently, they install Docker binary only.

- Ensure cluster/node e2e are 100% CRI focused.
- Ensure test-infra installs the CRI-docker or containerd binary on e2e machines. Currently, it installs Docker only.
- Document and announce a migration guide.
- Delete in-tree dockershim code from kubelet after a certain "grace period".


### Test Plan

_To be filled in once targeted at a release._

### Graduation Criteria

_To be filled in once targeted at a release._

## Implementation History

- 2019-02-28: Initial KEP sent out for discussion and review.