
2.0: Consider daemonization #821

Open
squeed opened this issue Mar 24, 2021 · 10 comments

@squeed
Member

squeed commented Mar 24, 2021

Background

The exec-based API can be awkward for a few reasons:

  • Many plugins need a daemon anyway
  • Writing binaries to disk is a security risk
  • Executing binaries in a containerized environment is fragile (not everything can / should be statically linked)

Considerations

  • Not all CNI runtimes are daemons, so this would need to be optional (or we continue supporting 1.0 indefinitely)
  • Golang and namespaces are still ripe for bugs
  • Should we use gRPC?
  • If a plugin is used for more than one "network", does that mean multiple processes, or should plugins support "multitenancy"?

Advantages

  • Daemonized plugins could support additional lifecycle / methods
    • garbage collection
    • initialization
  • Easier to deploy as a container

Bidirectionality?

Would we want plugins to be able to send events to the runtime? Or should we still model this as a request-response protocol?

@squeed squeed added the CNI 2.0 label Mar 24, 2021
@robberphex

CNI is an interface. So in CNI we could just define the interface, its methods, and the plugin/runtime interaction; the CNI server could be implemented by others (containerd, I think).

I think CNI should be defined via gRPC, so it can be called by the CRI or by others. For plugins, invocation via environment variables is fine; I think we can still use it in 2.0.
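(A hedged sketch of what that split could look like, expressed as a Go interface; every name below is illustrative and not from the CNI spec. The spec would define the interface and message shapes, e.g. as a gRPC service, and the server side could live in the runtime:)

```go
// Illustrative only; none of these names come from the CNI spec.
package cniv2

import "context"

// AttachmentRequest carries roughly what the exec/envvar protocol carries
// today: container identity, the target netns, and the network config.
type AttachmentRequest struct {
	ContainerID string
	NetnsPath   string
	IfName      string
	Config      []byte // network configuration JSON, as today
}

type AttachmentResult struct {
	Interfaces []string
	IPs        []string
}

// Plugin is what a daemonized plugin (or a server embedded in the runtime)
// would implement; a gRPC service definition would mirror these methods.
type Plugin interface {
	Add(ctx context.Context, req *AttachmentRequest) (*AttachmentResult, error)
	Check(ctx context.Context, req *AttachmentRequest) error
	Del(ctx context.Context, req *AttachmentRequest) error
}
```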

@danwinship
Contributor

Executing binaries in a containerized environment is fragile (not everything can / should be statically linked)

Citation required? It seems to me that the problems with dynamic linking are entirely in the opposite direction; if you copy a dynamically-linked binary from your container to the root disk, it may not be able to find the shared libraries it needs any more.

If you're talking about openssl FIPS compliance stuff specifically, I don't think that's a problem that can be solved at the CNI level; if a particular distribution (eg OCP) or cluster-infrastructure provider (eg EKS) needs to require specific things of all CNI plugins running in that environment, then either they need to build all the CNI plugins themselves, or they need to bless specific container images containing tested/approved CNI plugins. I don't see any way of getting around that.

Golang and namespaces are still ripe for bugs

To the best of my knowledge there is currently exactly one known bug, which is that if a goroutine running in a namespace spawns off a new goroutine, it is not guaranteed that the new goroutine will remain in the same namespace. Which seems like something golang needs to provide some fix for (though it will be complicated since sometimes you do want the current semantics).

Not all CNI runtimes are daemons, so this would need to be optional (or we continue supporting 1.0 indefinitely)
Easier to deploy as a container

It would be great if CNI 2.0 required all plugins to be run from containers, even if they aren't daemons, and did not require containers to make any modifications to the root disk.

@squeed
Member Author

squeed commented Mar 22, 2022

Executing binaries in a containerized environment is fragile ...

Citation required? It seems to me that the problems with dynamic linking are entirely in the opposite direction; if you copy a dynamically-linked binary from your container to the root disk, it may not be able to find the shared libraries it needs any more.

Yes, good point. I wrote it backwards. s/executing/leaking-from-a-container/

Golang and namespaces are still ripe for bugs

To the best of my knowledge there is currently exactly one known bug ...

Agreed, that is my understanding as well. I still don't love doing something that is, at best, unofficially supported by the runtime.

Not all CNI runtimes are daemons, so this would need to be optional (or we continue supporting 1.0 indefinitely)
Easier to deploy as a container

It would be great if CNI 2.0 required all plugins to be run from containers, even if they aren't daemons, and did not require containers to make any modifications to the root disk.

If we switch to gRPC, then the "interface" is a socket file, making no statement as to how plugins are executed (and thus enabling containerization). However, if we eschew gRPC and stick with an execution-based protocol, it would be interesting if the CNI flow included, somehow, the container runtime engine executing more containers (i.e. plugins) on behalf of CNI...
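(For illustration only: a minimal sketch, assuming a hypothetical socket path and insecure local credentials, of how a runtime could reach a daemonized plugin over a unix socket with gRPC. Nothing in it depends on how the plugin process was started:)

```go
// Sketch of the runtime side of a unix-socket gRPC "interface".
package main

import (
	"log"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
)

func main() {
	// The "interface" is just a socket path; whether the plugin behind it
	// runs in a container, a systemd unit, etc. is invisible here.
	conn, err := grpc.Dial(
		"unix:///run/cni/plugin.sock", // illustrative path, not specified anywhere
		grpc.WithTransportCredentials(insecure.NewCredentials()),
	)
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()
	// A generated client (e.g. for hypothetical Add/Del/Check RPCs) would be
	// constructed from conn here.
}
```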

Of course, many plugins need to store some state between invocations, so there is always that concern. But that is somewhat incidental.

@kfox1111

To the best of my knowledge there is currently exactly one known bug ...

Agreed, that is my understanding as well. I still don't love doing something that is, at best, unofficially supported by the runtime.

Got a link for posterity?

@kfox1111

Advantages

  • Daemonized plugins could support additional lifecycle / methods

    • garbage collection
    • initialization
  • Easier to deploy as a container

I'd add: • more consistent/predictable resource utilization
You can reserve it in the container and it's always allocated to the daemon, rather than trying to reserve it on the host and watching the host's memory fluctuate.
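(A hedged sketch of what that reservation could look like for a hypothetical daemonized plugin container, using the Kubernetes API types; the names and quantities are illustrative, not recommendations:)

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// pluginDaemonContainer is an illustrative container spec for a daemonized
// plugin; requests/limits make its footprint explicit and schedulable
// instead of an unaccounted cost on the host.
func pluginDaemonContainer() corev1.Container {
	return corev1.Container{
		Name:  "cni-plugin-daemon",             // hypothetical name
		Image: "example.com/cni-plugin:latest", // hypothetical image
		Resources: corev1.ResourceRequirements{
			Requests: corev1.ResourceList{
				corev1.ResourceCPU:    resource.MustParse("50m"),
				corev1.ResourceMemory: resource.MustParse("64Mi"),
			},
			Limits: corev1.ResourceList{
				corev1.ResourceCPU:    resource.MustParse("200m"),
				corev1.ResourceMemory: resource.MustParse("128Mi"),
			},
		},
	}
}

func main() {
	c := pluginDaemonContainer()
	fmt.Printf("%s requests: %v\n", c.Name, c.Resources.Requests)
}
```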

@danwinship
Contributor

It would be great if CNI 2.0 required all plugins to be run from containers, even if they aren't daemons, and did not require containers to make any modifications to the root disk.

(I meant "did not require plugins to make any modifications")

If we switch to gRPC, then the "interface" is a socket file, making no statement as to how plugins are executed (and thus enabling containerization). However, if we eschew gRPC and stick with an execution-based protocol, it would be interesting if the CNI flow included, somehow, the container runtime engine executing more containers (i.e. plugins) on behalf of CNI...

Yes, that's what I meant. I mean, obviously the runtime knows how to run a binary out of a container, so...

Of course, many plugins need to store some state between invocations, so there is always that concern. But that is somewhat incidental.

I'm fine with plugins being allowed to write to the root disk; I just don't think CNI should require plugins to write to the root disk. I.e., it should not require you to copy out a binary, and it should not require you to write out a config file.

@danwinship
Contributor

To the best of my knowledge there is currently exactly one known bug ...

Agreed, that is my understanding as well. I still don't love doing something that is, at best, unofficially supported by the runtime.

Got a link for posterity?

I don't have a link but I can explain the problem.

There's a Go runtime function (runtime.LockOSThread()) that lets you say "I'm doing something weird with this thread, and this goroutine needs to stay on that thread until I'm done". github.com/containernetworking/plugins/pkg/ns uses that: when you do netns.Do(func() { ... }) it calls runtime.LockOSThread(), then setns(), and then runs your function.

Long ago, people realized that this caused bugs, because although LockOSThread prevented the current goroutine from being moved to another thread, it didn't prevent other goroutines from being scheduled to this thread if your goroutine got blocked. So this caused awful problems where a goroutine's network namespace would randomly change. This eventually got fixed.

But more recently people discovered that the opposite problem exists; if you spawn off a new goroutine from your locked thread, that goroutine won't stay on the same thread. In particular, if you do netns.Do(func() { net.Dial("tcp", "dual-stack-host.example.com") }), then net.Dial will spawn goroutines to try IPv4 and IPv6 connections in parallel, but those goroutines will end up running in the main netns, not the one you were trying to use.

And this is not an unambiguous bug like the previous problem, because goroutines are ubiquitous and there are lots of circumstances where you wouldn't want the goroutine to inherit the special behavior from the original thread. So any fix for this is likely to require API changes of some sort (?) and it's not clear any fix is coming any time soon.
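(A minimal sketch of the hazard, assuming a pre-created namespace at an illustrative path; the closure itself runs in the target namespace, but the goroutines net.Dial spawns for parallel IPv4/IPv6 attempts are not pinned to the locked thread:)

```go
package main

import (
	"fmt"
	"net"

	"github.com/containernetworking/plugins/pkg/ns"
)

func main() {
	netns, err := ns.GetNS("/var/run/netns/example") // illustrative path
	if err != nil {
		panic(err)
	}
	defer netns.Close()

	err = netns.Do(func(_ ns.NetNS) error {
		// This goroutine is locked to a thread that has entered the target
		// netns. But net.Dial may spawn helper goroutines for parallel
		// IPv4/IPv6 connection attempts, and those are scheduled onto
		// arbitrary threads -- i.e. potentially in the wrong namespace.
		conn, err := net.Dial("tcp", "dual-stack-host.example.com:443")
		if err != nil {
			return err
		}
		defer conn.Close()
		fmt.Println("dialed from", conn.LocalAddr())
		return nil
	})
	if err != nil {
		fmt.Println("dial failed:", err)
	}
}
```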

@danwinship
Contributor

(And you can't just say "well don't use the 'unsafe' functions" because there's no way to know from the outside which functions are unsafe. It's entirely possible that plugins that work today would break in the future if additional goroutines were added to functions in "exec" or "github.com/vishvananda/netlink".)

(Ugh.)

@nevermore-muyi

Adding an advantage: in my environment I run 2,000 nodes and 40,000 pods.

  1. A CNI request needs to fetch pod info and network-attachment-definitions
  2. The CNI request fails
  3. The CNI request is retried and fails again
  4. The cycle repeats...

So at times many nodes are retrying CNI requests, all hitting the kube-apiserver and getting 429 responses back. With a daemonized plugin, each node's request rate could be limited via client-go's throttling, reducing the pressure on the kube-apiserver.
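(A minimal sketch of that throttling in a hypothetical long-running plugin daemon, assuming in-cluster credentials; the QPS/Burst values are illustrative, not recommendations:)

```go
package main

import (
	"log"

	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	// A single long-lived client per node lets client-go's built-in rate
	// limiter cap this node's load on the apiserver, instead of every
	// short-lived CNI invocation retrying independently.
	cfg.QPS = 5
	cfg.Burst = 10

	clientset, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}
	_ = clientset // shared by all CNI requests this daemon handles
}
```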

@yuval-k

yuval-k commented Sep 18, 2023

How would plugin delegation work? I.e., something like ptp calling host-local for IPAM.
Would the daemonized plugins make an RPC call back to the CNI runtime to invoke other plugins?
