KEP-3786: Migrate Shared Kube-Proxy code into staging #3788

astoycos · 2023-01-26T22:18:41Z

One-line PR description: Continuation of [WIP do not merge] Kube-Proxy Library: Breakout a KPNG-like interface "kube proxy lib" from #2104 #3649 with it's own issue.

Issue link: Migrate Shared Kube-Proxy Code into Staging #3786

Other comments:

This KEP is the outcome of the exploration into "what comes next" for the Kubernetes kube-proxy. It began with the successful large scale effort completed within the kpng project, and evolved out of the fact that KPNG as it stands today is simply too large to ever be merged back in tree. With this smaller and directly implementable kep we hope to start pushing kube-proxy into the future in a stable and incremental manner while also building on the lessons learned with kpng.

Adding details about how proxies work w/ and w/o the fullstate initial wave of fixes from dan's review updaedd diffstore explanation and typo fix, moved it updates to userstories and fixes to DiffState bundling logic removed exsiting Run() logic, Fix some spelling and typos Add section on the different types of client sinks. Signed-off-by: astoycos <astoycos@redhat.com> Add note about separate node kpng-brain and its fragility as a real feature Update README.md Refactoring Kube-Proxy, mega-commit Co-authored-by: mcluseau <mikael.cluseau@gmail.com> Co-authored-by: jayunit100 <jvyas@vmware.com> Co-authored-by: kbabben <knabben@vmware.com> Co-authored-by: rajas kakodkar <rkakodkar@vmware.com>

…paration of the APIServer uplink

Co-authored-by: Dan Winship <danwinship@redhat.com>

This update provides some fixes and review comment resolutions from Dan. It also clarifies where and what the proposed library would look like and the role it would initially serve for existing proxies. Also fix verification tests :) Signed-off-by: Andrew Stoycos <astoycos@redhat.com>

Signed-off-by: Andrew Stoycos <astoycos@redhat.com>

k8s-ci-robot · 2023-01-26T22:18:49Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: astoycos
Once this PR has been reviewed and has the lgtm label, please assign thockin for approval by writing /assign @thockin in a comment. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

keps/sig-network/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

astoycos · 2023-01-26T22:27:06Z

/assign @jayunit100 @thockin @danwinship

danwinship · 2023-01-27T14:02:09Z

(you should squash the commits)

uablrek · 2023-01-29T08:36:11Z

I really support getting kpng into staging. Great initiative @astoycos

I don't see kpng as a replacement for the current (and very stable) proxiers, but as a tool for writing specialized proxiers. But as a part of staging it must be tested in the "standard" K8s e2e tests. I can't see an obvious way to do that without at least one backend that actually can replace the current proxiers. If that would be the case, I propose the nft backend for Linux.

Another way would be if the CNI-plugins that replaces kube-proxy today, for instance Cilium, Calico, Antrea, Kube-router, (more?), would be convinced to use the kpng libraries and test kpng in their environments.

aojea · 2023-01-29T11:08:20Z

But as a part of staging it must be tested in the "standard" K8s e2e tests

that is not true, that only happens if you add tests for them and you maintain them, if the rationale is that the code will go into staging and magically it will be tested and fixed by the community I have to give you the bad news that is not going to happen ... you'll have to set up your jobs , configure them, and watch them ... and fix your flaky and non working tests

aojea · 2023-01-29T11:12:27Z

keps/sig-network/3786-migrate-shared-kube-proxy-code-to-staging/README.md

+
+## Alternatives
+
+We could retain the existing kube proxy shared code in core kubernetes and simply better document the data structures and golang maps used for kubernetes client operations and client side caching. However, that would still require external proxy implementations to copy and paste large amounts of code.  The other option is to not tackle this problem in-tree and to move forward with the singular development of external projects like KPNG as the overall framework for solving these problems.  The drawbacks to this include the large additional maintenance burden, and that it is opinionated towards a raw GRPC implementation and other users (i.e. XDS) want something more decoupled possibly.  This realization has inspired this KEP.


the maintenance of a shared library is much complex in-tree than out-of-tree, see client-go per example

I completely agree, maybe I used the term library a little bit prematurely here :)

This KEP was supposed to just focus on movement of code... Not really development of a library (yet) however I wanted it to be clear that moving the relevant shared bits to staging would be the first step in exposing the shared bits.

Unlike client-go we would make no true guarantees about this code or provide versioning semantics , it would just be much easier for folks to vendor essentially

aojea · 2023-01-29T11:15:51Z

I don't think that in-tree is the right solution for a shared library, see client-go problems with skewed versions.

aojea · 2023-01-29T14:43:47Z

keps/sig-network/3786-migrate-shared-kube-proxy-code-to-staging/README.md

+## Summary
+
+After about a year and a half of testing a [new service proxy implementation](https://github.com/kubernetes-sigs/kpng/), and
+collecting sig-network and community feedback, it is clear that a shared library (referred to as the kube-proxy staging library in this document) designed to make building new service proxies easier is needed. Specifically, it is desired by members of the Kubernetes networking community who are attempting to build specialized networking tools. However, in order to prevent the community from having to maintain


it is clear that a shared library (referred to as the kube-proxy staging library in this document) designed to make building new service proxies easier is needed.

There are multiple service proxies already out there, calico, cilium, ovn-kubernetes, ... can we expand why this need is clear and what is the intended audience?

So a lot of Antonio's comments basically boil down to "this doesn't make sense if you haven't been paying attention to kpng". (So the KEP needs to be keeping that in mind more.)

Answering this question directly: people writing alternative proxy implementations need to reimplement a lot of kube-proxy's logic, and there's really no good reason for this. Kube-proxy already has shared abstractions that are used by iptables, ipvs, and winkernel. We could split those out so that other implementations can use them too.

people writing alternative proxy implementations need to reimplement a lot of kube-proxy's logic, and there's really no good reason for this.

Indeed, this seems to be one of the more clear and obvious reasons for this redux. @astoycos it would definitely seem that in the summary here we should either add more specificity about the specific projects which have had to "work around" kube-proxy, or at least point out that we will have later sections in this document which will talk about those and create them, so that this is more clear to readers of the KEP.

So a lot of Antonio's comments basically boil down to "this doesn't make sense if you haven't been paying attention to kpng".

Right I definitely wanted to make this kep take the learnings from KPNG and play them forward rather than leaving them all behind, let me see if I can add more context to help.

it would definitely seem that in the summary here we should either add more specificity about the specific projects which have had to "work around" kube-proxy, or at least point out that we will have later sections in this document which will talk about those and create them, so that this is more clear to readers of the KEP.

Agreed let me see what I can do here, I think it will be super informative for me to dig in and see if any existing service proxies have custom solutions for some of this shared functionality.

Answering this question directly: people writing alternative proxy implementations need to reimplement a lot of kube-proxy's logic,

@danwinship this is the part I struggle, what part? the backends? because that is very specific of the technologie, iptables, ipvs, ovs are never going to be fully compatible, we have a considerable number of cases that ipvs and iptables behave different just because of how they work

We could split those out so that other implementations can use them too.

If we talk about the control-plane logic, that was the intention of the kep that kicked out kpng, I'm 100% onboard with that project, we should have a control plane that people can use, I always put the envoy control plane as an example of what we should do https://github.com/envoyproxy/go-control-plane

Also, do you think that current cache logic should be shared? it doesn't make use of the informer cache per example, and builds another cache on top, with complex apply and merge operations ... I think that can be much simpler https://github.com/aojea/networking-controllers/blob/main/pkg/services/services_controller.go

what part?

pkg/proxy, bits of pkg/proxy/healthcheck and pkg/proxy/utils, ...

change tracking / caching / EndpointSlice merging

topology handling

HealthCheckNodePort implementation

--nodeport-address handling

conntrack cleanup

...

Also, do you think that current cache logic should be shared?

We would start with the current code because that's what we have that works, but we should definitely look at improving it too. There are definitely problems with it.

Great, it seems we are on agreement on what are the problems.

I still think that building a new library is much better than carrying with existing code that has a lot of assumptions about iptables details, I liked @thockin thoughts he presented in https://docs.google.com/presentation/d/1Y-tZ4fFC9L2NvtBeiIXD1MiJ0ieg_zg4Q6SlFmIax8w/edit?usp=drivesdk&resourcekey=0-SFhIGTpnJT5fo6ZSzQC57g

aojea · 2023-01-29T14:48:26Z

keps/sig-network/3786-migrate-shared-kube-proxy-code-to-staging/README.md

+
+After about a year and a half of testing a [new service proxy implementation](https://github.com/kubernetes-sigs/kpng/), and
+collecting sig-network and community feedback, it is clear that a shared library (referred to as the kube-proxy staging library in this document) designed to make building new service proxies easier is needed. Specifically, it is desired by members of the Kubernetes networking community who are attempting to build specialized networking tools. However, in order to prevent the community from having to maintain
+two separate sets of code (the existing kube-proxy code and the aforementioned new library) while also ensuring the stability of the existing proxies, it makes sense to work incrementally towards this goal in the kube-proxy


This is just the opposite, maintaining current code is straightforward, there are no external consumers, only a few persons are doing this right now and it works because there is no much churn on the code.

However, maintaining a new library that does the same that current code is much more difficult and it will be not be able to be maintained with current model, we'll need much more people involved, this will be need to set up new infra, deal with library users bugs, versioning and backwards compatibility, ...

The idea (or my idea anyway) is something like apimachinery or component-helpers. It is not a fully-independent library that is expected to go off and live its own life and gain independent usage. It is just a part of k/k that we wanted to make easier for some people to vendor.

something like apimachinery or component-helpers

Or perhaps more relevantly, #3686

@aojea my take on your concerns is that you don't want to see this library becoming some common library in a large number of implementations, is that correct? If that is correct: it is my general understanding with KPNG that the main thing it wants to provide to third party implementations are interfaces and common tools such that software like Cilium can provide it's own implementation without being so disparate from upstream. Do you feel it may be possible to strike a balance here that you'd be comfortable with if we very explicitly defined what aspects of the library are intended for outside usage, so that we can provide the essentials and keep it at that? Do you think that would help?

The idea (or my idea anyway) is something like apimachinery or component-helpers. It is not a fully-independent library that is expected to go off and live its own life and gain independent usage. It is just a part of k/k that we wanted to make easier for some people to vendor.

@aojea This is exactly what I was thinking, and even more-so for this kep the first and most important goal would simply be moving of some of the shared proxy code to staging (without any alterations) very similar to #3686 :)

This is proposed in kubernetes/kubernetes#92369, which is kind of the birth place for kpng. Check the first comment from @mcluseau a bit down 😃

So is this shared code going to be developed in lock-step with Kubernetes, i.e. do you need to make changes for the code and Kubernetes in a single commit? That would be a strong argument that it needs to be a staging repo.

If nothing in Kubernetes uses the code, then a separate repo is clearly a better solution.

If Kubernetes uses the code, but could vendor new releases, then it's less clear. E2E testing can also be set up for the code when it is in a separate repo, it just might be a bit more work.

The initial proposal is to move it to staging. As long as k/k continues to build a kube-proxy binary, it is likely that things will stay in staging.

aojea · 2023-01-29T14:49:09Z

keps/sig-network/3786-migrate-shared-kube-proxy-code-to-staging/README.md

+staging library.
+
+This KEP describes the **first step** towards the ultimate end goal of providing
+a shared library of core kube-proxy functionality which can be easily consumed by the community.


This can be done out of tree and it will be easier to consume

I tend to agree with @aojea here. My sense is there is desire for an in-tree implementation as there is some sense that being out-of-tree may lead to the decline of kpng and therefore loss of value, is that accurate @astoycos, or am I misinterpreting?

So KPNG makes it simple-er to 1. write new backends and 2. deploy in some really smart/diverse scenarios however it locks you into a BUNCH of added complex machinery to do so, and as we've seen it is difficult to maintain.

IMO what folks really want is just 1. (e.g the ability to write new service proxies quickly) and the core bits to that (k8s client / client side caching, etc) are already there being used by the core proxies internally.

My sense is there is desire for an in-tree implementation as there is some sense that being out-of-tree may lead to the decline of kpng and therefore loss of value

Partly :) I don't think KPNG will "decline" but it's definitely not production-ready, where-as the core in-tree logic IS battle proven and production ready as is, so why not open it up for external consumption while also opening up the door for future improvements as well?

Ultimately if the answer is "The in-tree proxy code is stable and we intend to freeze/ maintain is AS-IS" I'm fine with that :), but if we want to pave a way for it to iterate into the future we need to start somewhere.

Thank you for the details. I believe I can personally be persuaded here, I would just like to see more of the motivation detailed out in the KEP.

aojea · 2023-01-29T14:54:01Z

keps/sig-network/3786-migrate-shared-kube-proxy-code-to-staging/README.md

+a shared library of core kube-proxy functionality which can be easily consumed by the community.
+Specifically, it will work to move the existing [shared kube-proxy code](https://github.com/kubernetes/kubernetes/tree/master/pkg/proxy) and associated network utilities into [staging](https://github.com/kubernetes/kube-proxy).
+
+This initial work will open the door for future kube-proxy code improvements to be made in both


What will be these improvements and in which areas?
kube-proxy intree has iptables and ipvs, ipvs is in maintenance mode since a long time with open issues that are not solved, thanks to Lars that stepped up is usable these days.
The "control-plane" logic of kube-proxy is practically the same for years.
The only "big improvements" in kube-proxy were iptables optimizations, most of them delivered by DanWinship.

So this is where I would plan on leaning heavily on the learning's from KPNG and it's community, I fore-see some possible optimizations in some of the following areas at this time:

The client-side caching mechanisms (e.g the serviceChangeTracker) were essentially converted to be B-Tree based in KPNG which could lead to some performance benefits (following much much more testing)

The ability to "artificially" test large cluster workloads in KPNG using static yaml files (dumped from real world clusters), if we could incorporate this for in-tree proxies it would help us test larger scale scenarios much more cheaply and easily.

Any optimizations it easier for the community to write new proxies with these shared bits (this could take many different paths) I know that's a bit of a "fake-one" but definitely relevant for the future.

I'm sure there's some I'm missing I guess the larger question really is do we want to iterate in-tree at all?

aojea · 2023-01-29T14:56:12Z

keps/sig-network/3786-migrate-shared-kube-proxy-code-to-staging/README.md

+A previous attempt at a broad solution to this problem was explored in the [KPNG project](https://github.com/kubernetes-sigs/kpng/), which exhibits many properties that allow for such goals to be accomplished.  However, because it introduced many new features and would result in large breaking changes if it were
+to be incorporated back in tree, it became clear we needed to decompose this large task into smaller pieces. Therefore, we've decided to keep things simple and start by moving the existing shared kube-proxy code into staging where it can be iterated on and augmented to in an safe, consumable and incremental manner.


Can we explain why this doesn't work using kpng as library for and why it will work moving this in-tree, where there are much more strict policies?

We can tell other people to use kpng as a library, but we can't make the in-tree proxies use it. (Because of the "much more strict policies". None of the kpng code has been reviewed (by sig-network-reviewers), and it has not seen nearly as much real-world usage as the existing kube-proxy code, so even if we could manage to review all of it, we couldn't be confident that it was sufficiently bug-free at scale.)

So if we want to help third-party proxy implementations, the options are: (1) continue to divide SIG Network resources between writing two separate proxy codebases, and tell third parties to use the one that we are not willing to use ourselves, or (2) try to make it possible for third parties to use our proxy base.

Agreed, more explanation of the motivation here would be helpful please 🙇

So if we want to help third-party proxy implementations

this is the claim I can't see a real evidence, are we saying that ovn-kubernetes, cilium, calico, antrea are going to use this library?

can we get some idea of "who" are these third-party implementations?

I think @danwinship Hit the nail on the head here, let me try to get some of his feedback in here.

this is the claim I can't see a real evidence, are we saying that ovn-kubernetes, cilium, calico, antrea are going to use this library?

The initial request came from Antrea and Calico are importing from k8s.io/kubernetes/pkg/proxy so yes, I think so.

But not if the library is out-of-tree!

oh, thanks, that is something important, can we add this to the KEP please?

aojea · 2023-01-29T15:05:28Z

keps/sig-network/3786-migrate-shared-kube-proxy-code-to-staging/README.md

+
+#### Story 3
+
+As a Kubernetes developer I want to make maintaining the shared proxy code easier, and allow for updates to that code to be completed in a more incremental and well tested way.


maintaining code and versioning in kubernetes/kubernetes is not easier, just the contrary.

There is also a misbelief that the testing comes for free, and that is not true, each SIG is responsible for maintaining, adding, monitoring and fixing its own tests, and this require a constant and periodic work, it can not be done sporadilly.

All the sig-network testing is maintained only by myself during last years, I tried to onboard more people, doing presentations, mentoring people 1on1s, but nobody sticks and people comes and goes leaving half baked tests, features or dashboards.
I'd like to see a plan and evidence that there will be more people involved on this work to avoid leaving the project in an undesired state

All the sig-network testing is maintained only by myself during last years, I tried to onboard more people, doing presentations, mentoring people 1on1s, but nobody sticks and people comes and goes leaving half baked tests, features or dashboards.

+1000 First of all thank YOU for all of this work :) I think everyone who is stable in the community is feeling some bit of burnout/ trouble with this (I know I am in the network-policy group)

So initially nothing would change with this move except for the location of the code. Unless I'm missing something....we wouldn't be providing any new versioning guarantees, and we'll run all of the existing tests against the same code (maybe this will require some new hacks?)

However now at least these bits can be used by external consumers, which IMO would hopefully lead to more involvement.

I'd like to see a plan and evidence that there will be more people involved on this work to avoid leaving the project in an undesired state

I totally agree maybe we can work on a plan here? At the root of it IMO if we expose functionality previously only exposed to core implementations, it would mean more users, which hopefully would also mean more folks involved.

t would mean more users, which hopefully would also mean more folks involved.

heh, I never seen that happen but I love to stand corrected here

we expose functionality previously only exposed to core implementations

that is my main fear, so we are going to have more features demands for niche use cases or to fix bug for implementation X that is not compatible with implementation Y

shaneutt

I am generally aligned with the spirit of KPNG and this KEP: I have some concerns about the future of the Kubernetes networking community if implementations continue to become more and more disparate and we're not building at least some of it in cooperation.

That said, I think there are some important concerns raised here that we should address:

more depth and clarity on what's at stake here, and what we think we'll lose if we don't move forward with something here. I'm kinda expecting the motivation section here to be fairly large before it's all said and done.
strong motivational clarity on why this should be in-tree OR not-in-tree,
details on the costs that will ensue, and plans for helping to control those costs (e.g. maintenance costs)

shaneutt · 2023-01-30T14:59:56Z

keps/sig-network/3786-migrate-shared-kube-proxy-code-to-staging/README.md

+## Summary
+
+After about a year and a half of testing a [new service proxy implementation](https://github.com/kubernetes-sigs/kpng/), and
+collecting sig-network and community feedback, it is clear that a shared library (referred to as the kube-proxy staging library in this document) designed to make building new service proxies easier is needed. Specifically, it is desired by members of the Kubernetes networking community who are attempting to build specialized networking tools. However, in order to prevent the community from having to maintain


people writing alternative proxy implementations need to reimplement a lot of kube-proxy's logic, and there's really no good reason for this.

Indeed, this seems to be one of the more clear and obvious reasons for this redux. @astoycos it would definitely seem that in the summary here we should either add more specificity about the specific projects which have had to "work around" kube-proxy, or at least point out that we will have later sections in this document which will talk about those and create them, so that this is more clear to readers of the KEP.

shaneutt · 2023-01-30T15:05:47Z

keps/sig-network/3786-migrate-shared-kube-proxy-code-to-staging/README.md

+
+After about a year and a half of testing a [new service proxy implementation](https://github.com/kubernetes-sigs/kpng/), and
+collecting sig-network and community feedback, it is clear that a shared library (referred to as the kube-proxy staging library in this document) designed to make building new service proxies easier is needed. Specifically, it is desired by members of the Kubernetes networking community who are attempting to build specialized networking tools. However, in order to prevent the community from having to maintain
+two separate sets of code (the existing kube-proxy code and the aforementioned new library) while also ensuring the stability of the existing proxies, it makes sense to work incrementally towards this goal in the kube-proxy


@aojea my take on your concerns is that you don't want to see this library becoming some common library in a large number of implementations, is that correct? If that is correct: it is my general understanding with KPNG that the main thing it wants to provide to third party implementations are interfaces and common tools such that software like Cilium can provide it's own implementation without being so disparate from upstream. Do you feel it may be possible to strike a balance here that you'd be comfortable with if we very explicitly defined what aspects of the library are intended for outside usage, so that we can provide the essentials and keep it at that? Do you think that would help?

shaneutt · 2023-01-30T15:10:36Z

keps/sig-network/3786-migrate-shared-kube-proxy-code-to-staging/README.md

+staging library.
+
+This KEP describes the **first step** towards the ultimate end goal of providing
+a shared library of core kube-proxy functionality which can be easily consumed by the community.


I tend to agree with @aojea here. My sense is there is desire for an in-tree implementation as there is some sense that being out-of-tree may lead to the decline of kpng and therefore loss of value, is that accurate @astoycos, or am I misinterpreting?

shaneutt · 2023-01-30T15:11:11Z

keps/sig-network/3786-migrate-shared-kube-proxy-code-to-staging/README.md

+A previous attempt at a broad solution to this problem was explored in the [KPNG project](https://github.com/kubernetes-sigs/kpng/), which exhibits many properties that allow for such goals to be accomplished.  However, because it introduced many new features and would result in large breaking changes if it were
+to be incorporated back in tree, it became clear we needed to decompose this large task into smaller pieces. Therefore, we've decided to keep things simple and start by moving the existing shared kube-proxy code into staging where it can be iterated on and augmented to in an safe, consumable and incremental manner.


Agreed, more explanation of the motivation here would be helpful please 🙇

shaneutt · 2023-01-30T15:11:31Z

keps/sig-network/3786-migrate-shared-kube-proxy-code-to-staging/README.md

+
+- Move the [shared kube-proxy code k/k/pkg/proxy](https://github.com/kubernetes/kubernetes/tree/master/pkg/proxy), and relevant
+networking utilities (i.e `pkg/util/iptables`) into the kube-proxy [staging repository](https://github.com/kubernetes/kube-proxy).
+- Ensure all existing kube-proxy unit and e2e coverage remains the same or is improved.


very important goal to emphasize 👍

shaneutt · 2023-01-30T15:12:11Z

keps/sig-network/3786-migrate-shared-kube-proxy-code-to-staging/README.md

+networking utilities (i.e `pkg/util/iptables`) into the kube-proxy [staging repository](https://github.com/kubernetes/kube-proxy).
+- Ensure all existing kube-proxy unit and e2e coverage remains the same or is improved.
+- Ensure the shared code can be easily vendored by external users to help write new out of tree service proxies.
+- Write documentation detailing how external consumers can utilize the kube-proxy staging library.


uablrek · 2023-01-30T15:38:33Z

that is not true, that only happens if you add tests for them and you maintain them, if the rationale is that the code will go into staging and magically it will be tested and fixed by the community I have to give you the bad news that is not going to happen ... you'll have to set up your jobs , configure them, and watch them ... and fix your flaky and non working tests

Sorry Antonio, I phrased that badly.

I meant that the KEP must include e2e tests of the library, so it's clear that it requires implementation, configuration and setup. Maintenance (watching tests and fix faults) I am afraid I have just assumed that the kpng enthusiasts within sig/network would do. But perhaps that too should be in the KEP, so nobody thinks that getting the kpng lib to staging is a fire-and-forget thing.

As I see it, a possible way is to use the existing network e2e tests, but that requires a backend. New test setups and jobs must still be created of course.

I would like to re-write proxy-mode=ipvs to use the kpng library, but I think that is too intrusive, so instead I suggest nft.

A kpng based proxier with nft backend, well tested by the K8s CI, may be a useful addition in itself. Nft has a built-in hash function so the probability ladder in proxy-mode=iptables is not necessary, and the implementation can be isolated much better with nft than with iptables.

astoycos · 2023-01-30T17:03:26Z

Thanks so much for all your comments + questions @danwinship @uablrek @aojea @shaneutt :)

I'm slowly working through each of them and will surely have a bunch of updates to make this week.

At the end of the day I think we've been talking about this for a LONG time (shoutout to @uablrek for posting even more of the existing conversations e.g kubernetes/kubernetes#92369) and IMO we need to pave the way forward for CORE Kubernetes Networking, regardless of what the future is I'm hoping we can at least agree on a very small and concise "first step" in this KEP.

mcluseau · 2023-01-31T06:19:18Z

How it looks like in kpng: https://github.com/kubernetes-sigs/kpng/blob/master/examples/print-state/main.go Last commit back in 2021 ;-) Le mar. 31 janv. 2023 à 07:15, Antonio Ojea ***@***.***> a écrit :

…

***@***.**** commented on this pull request. ------------------------------ In keps/sig-network/3786-migrate-shared-kube-proxy-code-to-staging/README.md <#3788 (comment)> : > +- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io] +- [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes + + + +[kubernetes.io]: https://kubernetes.io/ +[kubernetes/enhancements]: https://git.k8s.io/enhancements +[kubernetes/kubernetes]: https://git.k8s.io/kubernetes +[kubernetes/website]: https://git.k8s.io/website + +## Summary + +After about a year and a half of testing a [new service proxy implementation](https://github.com/kubernetes-sigs/kpng/), and +collecting sig-network and community feedback, it is clear that a shared library (referred to as the kube-proxy staging library in this document) designed to make building new service proxies easier is needed. Specifically, it is desired by members of the Kubernetes networking community who are attempting to build specialized networking tools. However, in order to prevent the community from having to maintain Great, it seems we are on agreement on what are the problems. I still think that building a new library is much better than carrying with existing code that has a lot of assumptions about iptables details, I liked @thockin <https://github.com/thockin> thoughts he presented in https://docs.google.com/presentation/d/1Y-tZ4fFC9L2NvtBeiIXD1MiJ0ieg_zg4Q6SlFmIax8w/edit?usp=drivesdk&resourcekey=0-SFhIGTpnJT5fo6ZSzQC57g — Reply to this email directly, view it on GitHub <#3788 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AA5QROA5YCG7WZRRJYWP5RDWVCUUDANCNFSM6AAAAAAUIAZBWU> . You are receiving this because you were mentioned.Message ID: ***@***.***>

aojea · 2023-01-31T07:35:20Z

How it looks like in kpng: https://github.com/kubernetes-sigs/kpng/blob/master/examples/print-state/main.go

🤔

that is the part I don't understand, why we are claiming that we are doing this because there is a strong desire and demand and we already have a solution?

it kinds of feel that we want to put the code in k/k as a marketing thing

danwinship · 2023-01-31T13:25:38Z

we already have a solution?

We don't have a solution, we have two solutions, with distinct sets of bugs and features.

aojea · 2023-01-31T15:54:26Z

Then why don't we propose what we want, it was not clear from the KEP and the comments that this was the main goal.

"moving the code of kube-proxy to staging" as it is doesn't make any sense to me honestly.
However, refactoring current code and defining a nice API makes sense, and we can discuss these things in the KEP,

Goal

Define a public API so users can create control planes for their Service proxies
Refactor and improve current code, targeting especially the duplication of caches and the handling of updates

Non-goal

Add more backends or data planes implementations
Add new code with the goal of testing it in k/k

Proposal

Define a new API (add things like this per example https://docs.google.com/presentation/d/1Y-tZ4fFC9L2NvtBeiIXD1MiJ0ieg_zg4Q6SlFmIax8w/edit?usp=drivesdk&resourcekey=0-SFhIGTpnJT5fo6ZSzQC57g) that can be reused for implementing new proxies bla bla bla

Design Details

The new API will have the following types, functions, .... (and we review it in the KEP, simplifying the subsequent PRs)

// Handles the addresses used for NodePorts in a node
type NodePortAddress  struct{}

func (n NodePortaddress) GetNodeAddresses() [] string

// 
type Topology struct

//

Testing

The code will be migrated incrementally, exporting the different functions individually adding the corresponding unit test and verifying that there are no regressions.

Since most of the changes are related to the control-plane we'll add unit test and benchmarks verifying the improvements
...

thockin · 2023-01-31T21:17:35Z

I haven't reads this all (just the comments) but I agree that the current code isn't something I'd be super excited about telling people to use as it exists.

But cleaning it up sounds OK, if it furthers the goal of having less projects totally reimplement it all on their own.

jayunit100 · 2023-02-01T12:49:20Z

"moving the code of kube-proxy to staging" as it is doesn't make any sense to me honestly.

fair point for sure @aojea ... im not suggesting anything one way or other here - but ill just float an idea out here for others to decide:

... one alternative approach could be to just design the thing we want, from first principles as antonio suggests above
then file issues against KPNG to refactor it over time into more like what sig-network would want in an out-of-tree implementation?
- this admittedly would be hard and we'd need to work very closely w/ dan and tim and antonio and mikael and many others... but.... the net effort long term might be lower then iterating around in-tree efforts, which doesnt seem to have consensus in the sig.

In some ways, we just did this the last few weeks, in https://github.com/kubernetes-sigs/windows-service-proxy where we basically forked off our own to the k2s.go, and made a single integrated binary, which was alot of the feedback folks had (i.e. do we need GRPC?), and consumed KPNG itself as a library...

... (as a way to streamline that... maybe we just literally do that at the next kubecon - bc its clear most of the people who are interested will be there) ...

astoycos · 2023-02-02T05:48:06Z

I love all the action here :)

Just really quickly I'd like to take a step back to hopefully help anyone who hasn't been around the whole time, since I feel like we've gone back and forth based on a few key roadblocks (I'll probably add this back into the KEP as well)

~~HISTORY: CLICK ME TO READ A NOVEL~~

The initial KPNG KEP -> KEP-2104: kube-proxy v2: reworking the proxy's architecture #2094

It describes KPNG as it stands today, a great tool which essentially solves two major problems

Complex proxy deployment scenarios
Makes it simple to write new service proxy backends

However today(we could totally change this) these features are tightly coupled, i.e you need to buy into the GRPC API (which solves complex deployment) in order to get the benefits which make writing new backends easier.

On this Kep wayyyy back in 2021 :) @thockin pushed hard for KPNG to move towards a library (see this comment) and then again this year in a nice slide dec around Kubecon NA.

I initially did a poc of moving KPNG towards the comment's/sildeshow's suggested design in kubernetes-retired/kpng#389 and now based on @jayunit100's comment they're doing something similar for the windows bits.

Regardless out of all this work on KPNG and the KEP #2094 we never really figured out one of the most crucial questions... i.e

Should any of the work ever get back in-tree and how does it play into the future of the existing kube-proxies.

However, we did mostly agree that KPNG is wayy too big to ever make it back into tree as it stands today, so I'll leave it at that for now.

Next came the spinoff lib KEP looking at how we could create such a library ->

[WIP do not merge] Kube-Proxy Library: Breakout a KPNG-like interface "kube proxy lib" from #2104 #3649

In this KEP we tried to ONLY target building exactly what @thockin an others were after with a nice library which focused on making backends easier to write.

Also in order to avoid duplicating code maintenance efforts and functionality bits we began talking and thinking about actually using it in the existing proxies which began to answer that initial unanswered question.

However that involved possibly jeopardizing the stability of the existing super stable proxies so let's make sure the transition is super slow.......enter this KEP

KEP-3786: Migrate Shared Kube-Proxy code into staging #3788 (read the KEP)
A tiny step forward focusing on simply making the stable functionality we already have in our proxies useable for others.

At the end of the day I'm a bit impartial to what we decide to do moving forward (as long as we do come to a consensus + conclusion eventually) but for any future work it would be great to balance the following high level things:

Answer, Should any of the work ever get back in-tree?
Answer, How does the work play into the future of the existing kube-proxies?
Ensure, That we aren't splitting our already strained OS resources on too many things.

For example:

If we as a sig want to go with what @aojea posted above and believe it will live somewhere "in-tree", can we agree to ensure the new functionality is used by existing proxies to avoid the maintenance of two similar sets of code.

OR

If we as a sig want to commit to KPNG (the out-of-tree solution) can we put a stake in the ground saying we'll do our best to maintain the existing stability of Kube-Proxy but all new feature/proxy development efforts and focus should move to KPNG.

astoycos · 2023-02-02T18:06:04Z

Working on re-factoring this following the sig-net meeting on February 2nd.....

aojea · 2023-02-08T12:34:49Z

The feedback from one of the current consumers is that they only need the control plane

kubernetes/kubernetes#92369 (comment)

copying it here for visibility

We don't need iptables and ipvs since we have our own data-plan like iptables or ipvs.

For the earlier version of our implementation, we only forked service.go, types.go, endpoints.go and runner.go as a framework and added our own data-plane code; later when we added feature Endpointslice, we also need to forkendpointslicecache.go.

To catch up with the features like TopologyAwareHints and ProxyTerminatingEndpoints in our CNI, we even need to learn the logic in topology.go and add the logic to our code to categorize Endpoints(like this function

[kubernetes/pkg/proxy/topology.go](https://github.com/kubernetes/kubernetes/blob/b7ad17978eaba508c560f81bab2fb9d0b256e469/pkg/proxy/topology.go#L40)

Line 40 in [b7ad179](https://github.com/kubernetes/kubernetes/commit/b7ad17978eaba508c560f81bab2fb9d0b256e469)

 func CategorizeEndpoints(endpoints []Endpoint, svcInfo ServicePort, nodeLabels map[string]string) (clusterEndpoints, localEndpoints, allReachableEndpoints []Endpoint, hasAnyEndpoints bool) { 
). If topology.go and corresponding files could be used as a library, we could use the latest feature like TopologyAwareHints or ProxyTerminatingEndpoints by upgrading the library.

danwinship · 2023-02-09T14:22:29Z

OK, so with #3866 officially proposed now, let's add a goal that it should be possible to implement that proxy out-of-tree, without unnecessarily duplicating any code used by the itpables and ipvs proxies. (This doesn't mean that I will implement it out-of-tree; this just helps explain what does and doesn't need to be staged.)

astoycos · 2023-03-17T14:54:59Z

/close

in favor of the ongoing cleanup and refactoring work going on with the nftables backend effort #3824

k8s-ci-robot · 2023-03-17T14:55:06Z

@astoycos: Closed this PR.

In response to this:

/close

in favor of #3824

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

mcluseau and others added 21 commits January 26, 2023 15:56

Add user story that clarifies the performance tuning available via se…

ca4b117

…paration of the APIServer uplink

add user story about new languages

50b6994

Update README.md

7b05bcf

Update README.md

2089b3d

update TOC

9e4c077

kube-proxy-lib initial stab at minimal KEP will add details later

317119b

cleanup

e13b15c

Updates

e87176a

retitle

2a0152f

link to tims slides

37f26b2

note from per and dan about long term goals

8c1da21

Update keps/sig-network/2104-kube-proxy-lib/README.md

ae3542e

Co-authored-by: Dan Winship <danwinship@redhat.com>

Update keps/sig-network/2104-kube-proxy-lib/README.md

5e5c72c

Co-authored-by: Dan Winship <danwinship@redhat.com>

Update keps/sig-network/2104-kube-proxy-lib/README.md

b5270a3

Co-authored-by: Dan Winship <danwinship@redhat.com>

Apply suggestions from code review

0efe72e

Co-authored-by: Dan Winship <danwinship@redhat.com>

Update README.md

9e4cb1b

Update README.md

73fc0d5

Migrate "kube-proxy-lib" to new issue

4c43ad4

Signed-off-by: Andrew Stoycos <astoycos@redhat.com>

Additional updates

4ada79d

Signed-off-by: Andrew Stoycos <astoycos@redhat.com>

k8s-ci-robot added kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. sig/network Categorizes an issue or PR as relevant to SIG Network. labels Jan 26, 2023

k8s-ci-robot requested review from caseydavenport and dcbw January 26, 2023 22:18

k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jan 26, 2023

astoycos mentioned this pull request Jan 26, 2023

[WIP do not merge] Kube-Proxy Library: Breakout a KPNG-like interface "kube proxy lib" from #2104 #3649

Closed

aojea reviewed Jan 29, 2023

View reviewed changes

shaneutt reviewed Jan 30, 2023

View reviewed changes

k8s-ci-robot closed this Mar 17, 2023

kikisdeliveryservice mentioned this pull request May 12, 2023

Migrate Shared Kube-Proxy Code into Staging #3786

Closed

aojea mentioned this pull request Feb 14, 2024

KNI [Kubernetes Networking Interface] Initial Draft KEP #4477

Closed


		## Alternatives

		We could retain the existing kube proxy shared code in core kubernetes and simply better document the data structures and golang maps used for kubernetes client operations and client side caching. However, that would still require external proxy implementations to copy and paste large amounts of code. The other option is to not tackle this problem in-tree and to move forward with the singular development of external projects like KPNG as the overall framework for solving these problems. The drawbacks to this include the large additional maintenance burden, and that it is opinionated towards a raw GRPC implementation and other users (i.e. XDS) want something more decoupled possibly. This realization has inspired this KEP.

		A previous attempt at a broad solution to this problem was explored in the [KPNG project](https://github.com/kubernetes-sigs/kpng/), which exhibits many properties that allow for such goals to be accomplished. However, because it introduced many new features and would result in large breaking changes if it were
		to be incorporated back in tree, it became clear we needed to decompose this large task into smaller pieces. Therefore, we've decided to keep things simple and start by moving the existing shared kube-proxy code into staging where it can be iterated on and augmented to in an safe, consumable and incremental manner.


		#### Story 3

		As a Kubernetes developer I want to make maintaining the shared proxy code easier, and allow for updates to that code to be completed in a more incremental and well tested way.

KEP-3786: Migrate Shared Kube-Proxy code into staging #3788

KEP-3786: Migrate Shared Kube-Proxy code into staging #3788

Conversation

astoycos commented Jan 26, 2023 • edited Loading

k8s-ci-robot commented Jan 26, 2023

astoycos commented Jan 26, 2023

danwinship commented Jan 27, 2023

uablrek commented Jan 29, 2023 • edited Loading

aojea commented Jan 29, 2023

Choose a reason for hiding this comment

astoycos Jan 30, 2023 • edited Loading

Choose a reason for hiding this comment

aojea commented Jan 29, 2023

Choose a reason for hiding this comment

danwinship Jan 30, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

astoycos Jan 30, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

aojea Jan 30, 2023 • edited Loading

Choose a reason for hiding this comment

shaneutt left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

uablrek commented Jan 30, 2023

astoycos commented Jan 30, 2023 • edited Loading

mcluseau commented Jan 31, 2023 via email

aojea commented Jan 31, 2023

danwinship commented Jan 31, 2023

aojea commented Jan 31, 2023

Goal

Non-goal

Proposal

Design Details

Testing

thockin commented Jan 31, 2023

jayunit100 commented Feb 1, 2023 • edited Loading

astoycos commented Feb 2, 2023 • edited Loading

astoycos commented Feb 2, 2023

aojea commented Feb 8, 2023

danwinship commented Feb 9, 2023

astoycos commented Mar 17, 2023 • edited Loading

k8s-ci-robot commented Mar 17, 2023

astoycos commented Jan 26, 2023 •

edited

Loading

uablrek commented Jan 29, 2023 •

edited

Loading

astoycos Jan 30, 2023 •

edited

Loading

danwinship Jan 30, 2023 •

edited

Loading

astoycos Jan 30, 2023 •

edited

Loading

aojea Jan 30, 2023 •

edited

Loading

astoycos commented Jan 30, 2023 •

edited

Loading

jayunit100 commented Feb 1, 2023 •

edited

Loading

astoycos commented Feb 2, 2023 •

edited

Loading

astoycos commented Mar 17, 2023 •

edited

Loading