Multicast stats imple #10

ceclinux · 2022-05-11T08:01:54Z

No description provided.

…o#3508)

1. Add local Pod receivers into an OpenFlow type "all" group for each multicast group, and use such groups in the flow actions. Remove a Pod from group buckets if the Pod has left the multicast group or is deleted before leaving the multicast group.
2. Improve multicast e2e tests.

Signed-off-by: wenyingd <wenyingd@vmware.com>
Co-authored-by: Ruochen Shen <src655@gmail.com>

…trea-io#3730)

The assumption that a secret token is automatically created for each ServiceAccount is not valid anymore, starting with K8s v1.24. We make the following changes:

- create a Secret in our reference Prometheus manifest for the `prometheus` ServiceAccount; this addresses the failure in Prometheus e2e tests.
- create a Secret in the Antrea manifest for the antctl and antrea-agent ServiceAccount (but not for the antrea-controller ServiceAccount). Note that the token for the antrea-agent ServiceAccount is only really necessary when running the Antrea Agent as a process on Windows Nodes, so in the future we may want to conditionally generate the Secret only for that use case.
- remove the manual installation instructions: the instructions require a token for the antrea-controller ServiceAccount, but we don't have one anymore. The instructions are also outdated (e.g., we no longer have ready-to-use RBAC manifests), and running the Agent manually is likely to work sub-optimally (no downward API so missing environment variables).

Fixes antrea-io#3729

Signed-off-by: Antonin Bas <abas@vmware.com>

Bumps [docker/login-action](https://github.com/docker/login-action) from 1 to 2.

- [Release notes](https://github.com/docker/login-action/releases)
- [Commits](docker/login-action@v1...v2)

updated-dependencies:
- dependency-name: docker/login-action
  dependency-type: direct:production
  update-type: version-update:semver-major

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bumps [docker/setup-qemu-action](https://github.com/docker/setup-qemu-action) from 1 to 2.

- [Release notes](https://github.com/docker/setup-qemu-action/releases)
- [Commits](docker/setup-qemu-action@v1...v2)

updated-dependencies:
- dependency-name: docker/setup-qemu-action
  dependency-type: direct:production
  update-type: version-update:semver-major

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bumps [docker/build-push-action](https://github.com/docker/build-push-action) from 2 to 3.

- [Release notes](https://github.com/docker/build-push-action/releases)
- [Commits](docker/build-push-action@v2...v3)

updated-dependencies:
- dependency-name: docker/build-push-action
  dependency-type: direct:production
  update-type: version-update:semver-major

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bumps [docker/setup-buildx-action](https://github.com/docker/setup-buildx-action) from 1 to 2.

- [Release notes](https://github.com/docker/setup-buildx-action/releases)
- [Commits](docker/setup-buildx-action@v1...v2)

updated-dependencies:
- dependency-name: docker/setup-buildx-action
  dependency-type: direct:production
  update-type: version-update:semver-major

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Remove the update permission on services/status from the antrea-agent service account. Accordingly, remove the ServiceExternalIP optimization for cases where ExternalTrafficPolicy is set to Local. Introduce an "antctl get serviceexternalip" command for the agent to make it easier to check which Node an external IP is assigned to.

Signed-off-by: Xu Liu <xliu2@vmware.com>

This commit increases the priority of the flow matching ct_state=+inv+trk from low to high and removes some unused reject bypass flows.

1. Increase priority

   There are two flows in the ConntrackState table with the same priority:

   `...priority=190,ct_state=+inv+trk,ip actions=drop`

   `...priority=190,ct_state=-new+trk,ip actions=resubmit (,AntreaPolicyEgressRule)`

   If a packet matches both of these flows, there is no guarantee which one will be hit.

2. Remove useless reject flows

   Some flows were used by the previous reject bypass logic. Remove them since they are no longer in use.

Signed-off-by: wgrayson <wgrayson@vmware.com>
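The ambiguity between the two priority-190 flows can be illustrated with a small sketch. It is a simplified model rather than Open vSwitch code: it looks only at the `ct_state` flags and ignores every other match field, and the helper names `parse_ct_state` and `can_overlap` are hypothetical.

```python
import re


def parse_ct_state(spec):
    """Parse a ct_state match such as '+inv+trk' into {'inv': True, 'trk': True}."""
    return {flag: sign == "+" for sign, flag in re.findall(r"([+-])([a-z]+)", spec)}


def can_overlap(spec_a, spec_b):
    """Return True if some packet could satisfy both ct_state matches.

    Two matches conflict only when they require opposite values for the
    same flag; flags mentioned by just one side are unconstrained.
    """
    a, b = parse_ct_state(spec_a), parse_ct_state(spec_b)
    return all(a[flag] == b[flag] for flag in a.keys() & b.keys())


# The two flows from the commit message share only the 'trk' flag, and both
# require it to be set, so a tracked, invalid, non-new packet matches both.
print(can_overlap("+inv+trk", "-new+trk"))
```

When two overlapping flows share a priority, the winner is undefined, which is why the commit gives the `+inv+trk` drop flow a higher priority.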

* Remove ELK Flow Collector

  This PR removes ELK Flow Collector's related files:

  1. manifests under build/yamls/elk-flow-collector
  2. jenkins CI validation job
  3. quick deployment options in scripts
  4. documentation

  Signed-off-by: heanlan <hanlan@vmware.com>

* Fix dead markdown link in ci/jenkins/README.md

  Signed-off-by: heanlan <hanlan@vmware.com>

This change fixes a few minor issues in the current Kind e2e test:

- GitHub Kind workflow cleanup not waiting for flow visibility job
- test-e2e-kind.sh pulling the wrong busybox image
- FlowAggregator image being wrongly included for no-FA e2e test

Signed-off-by: Shawn Wang <wshaoquan@vmware.com>

1. Use containerd runtime as creation of node pools using node images based on Docker container runtimes is not supported in GKE v1.23, which is now the default version.
2. Pin conformance image version to v1.20.15 as there is a problem with the netpol suite added in v1.21.0 when running on GKE.

Signed-off-by: Quan Tian <qtian@vmware.com>

To support performing both SNAT and DNAT for traffic, Antrea uses two CT zones, one for SNAT and one for DNAT. For each packet, multiple CT actions are executed to go through the zones. Because SNAT is performed after DNAT, reply traffic wouldn't be unNATed correctly if it went through the zones in the same order as request traffic, so an extra CT action for unSNAT was added before DNAT to resolve this.

These CT actions introduce measurable overhead to the dataplane. Since the first unSNAT action is for reply traffic of SNATed connections only, and only a few cases need SNAT, this patch adds conditions to the unSNAT flow to make irrelevant traffic bypass it. With fewer CT actions and less recirculation, the dataplane performance is significantly increased.

TCP_RR and TCP_CRR improvements in a kind cluster are as follows:

```
Test     old TPS    new TPS    delta
TCP_RR   14568.69   17826.26   +22.36%
TCP_CRR  2781.7     3498.12    +25.75%
```

Signed-off-by: Quan Tian <qtian@vmware.com>
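The delta column in the benchmark table is the relative change in transactions per second. A short sketch that recomputes it from the reported old and new TPS values follows; the function name `delta_percent` is hypothetical.

```python
def delta_percent(old_tps, new_tps):
    """Relative change from old_tps to new_tps, expressed in percent."""
    return (new_tps - old_tps) / old_tps * 100.0


# TPS figures copied from the commit message's table.
results = {
    "TCP_RR": (14568.69, 17826.26),
    "TCP_CRR": (2781.7, 3498.12),
}

for test, (old, new) in results.items():
    print(f"{test}: {delta_percent(old, new):+.2f}%")
```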

…n in L3ForwardingTable (antrea-io#3809)

Fix antrea-io#3806

Currently, when Egress and AntreaIPAM are enabled, there are several flows in L3ForwardingTable like the following:

```
1. table=L3Forwarding, priority=190,ip,reg0=0/0x200,reg8=0/0xfff,nw_dst=10.10.0.0/24 actions=resubmit(,L2ForwardingCalc)
2. table=L3Forwarding, priority=190,ct_mark=0x10/0x10,reg0=0x200/0x200,reg4=0/0x100000 actions=mod_dl_dst:d2:35:24:7f:3a:f8,load:0x2->NXM_NX_REG0[4..7],resubmit(,L3DecTTL)
3. table=L3Forwarding, priority=190,ct_mark=0x10/0x10,reg4=0x100000/0x100000 actions=resubmit(,L3DecTTL)
4. table=L3Forwarding, priority=190,ct_state=-rpl+trk,ip,reg0=0x3/0xf actions=resubmit(,EgressMark)
5. table=L3Forwarding, priority=190,ct_state=-rpl+trk,ip,reg0=0x1/0xf actions=mod_dl_dst:d2:35:24:7f:3a:f8,resubmit(,EgressMark)
6. table=L3Forwarding, priority=0 actions=load:0x2->NXM_NX_REG0[4..7],resubmit(,L2ForwardingCalc)
```

- Flow 1 is used to forward the packets of non-Service connections between local Pods.
- Flow 2 is used to forward the packets of Service connections sourced from local non-AntreaIPAM Pods and destined for an external network Endpoint.
- Flow 3 is used to forward the packets of Service connections sourced from local AntreaIPAM Pods and destined for an external network Endpoint.
- Flow 4 is used to forward the packets sourced from local Pods and destined for the external network; this flow is for Egress.
- Flow 5 is used to forward the packets sourced from the tunnel and destined for the external network; this flow is also for Egress.
- Flow 6 is the default flow. `load:0x2->NXM_NX_REG0[4..7]` means `to Gateway`.

Request packets sourced from a local Pod and destined for another local Pod are expected to be matched by flow 1. However, they can also be matched by flow 4, which leads to the unexpected result in antrea-io#3806.

In addition, when Egress is enabled, if a local Pod accesses a Service whose Endpoint is on the external network, the request packets are expected to be matched by flow 4, but they can also be matched by flow 2.

To resolve the above issues, the new flows are as follows:

```
1. table=L3Forwarding, priority=200,ip,reg0=0/0x200,reg8=0/0xfff,nw_dst=10.10.0.0/24 actions=resubmit(,L2ForwardingCalc)
2. table=L3Forwarding, priority=190,ct_mark=0x10/0x10,reg0=0x202/0x20f actions=mod_dl_dst:d2:35:24:7f:3a:f8,load:0x2->NXM_NX_REG0[4..7],resubmit(,L3DecTTL)
3. table=L3Forwarding, priority=190,ct_state=-rpl+trk,ip,reg0=0x3/0xf,reg4=0/0x100000 actions=resubmit(,EgressMark)
4. table=L3Forwarding, priority=190,ct_state=-rpl+trk,ip,reg0=0x1/0xf actions=mod_dl_dst:d2:35:24:7f:3a:f8,resubmit(,EgressMark)
5. table=L3Forwarding, priority=0 actions=load:0x2->NXM_NX_REG0[4..7],resubmit(,L2ForwardingCalc)
```

Issue antrea-io#3806 is fixed by raising the priority of the legacy flow 1 to normal (200). It is now the new flow 1.

In the legacy flow 2, packets of Service connections with unknown destination can be sourced from either local Pods or the Antrea gateway. In the new flow 2, only the packets of Service connections sourced from the Antrea gateway are matched. Packets of connections sourced from local Pods and destined for the external network are matched by the new flow 3 (Egress is enabled) or the new flow 5 (Egress is disabled). Packets of connections sourced from local AntreaIPAM Pods (AntreaIPAM is enabled) and destined for the external network are matched by the new flow 5.

Other modifications: fix some stale comments.

Signed-off-by: Hongliang Liu <lhongliang@vmware.com>
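The `reg0=value/mask` matches in these flows compare only the bits selected by the mask. The sketch below shows how narrowing the match from `0x200/0x200` to `0x202/0x20f` excludes packets sourced from local Pods. It assumes, based only on the flow descriptions in this commit message, that the low four bits of reg0 encode the packet source (0x1 tunnel, 0x2 Antrea gateway, 0x3 local Pod); the function name `masked_match` is hypothetical.

```python
def masked_match(reg, value, mask):
    """OpenFlow-style masked match: only bits set in mask are compared."""
    return (reg & mask) == value


# Assumed encodings, inferred from the flow descriptions (not from Antrea code):
# bit 0x200 set, low nibble = packet source.
FROM_GATEWAY = 0x202
FROM_LOCAL_POD = 0x203

# Legacy flow 2 (reg0=0x200/0x200) ignores the source bits, so both match.
print(masked_match(FROM_GATEWAY, 0x200, 0x200), masked_match(FROM_LOCAL_POD, 0x200, 0x200))

# New flow 2 (reg0=0x202/0x20f) also checks the source bits, so only
# gateway-sourced packets match.
print(masked_match(FROM_GATEWAY, 0x202, 0x20F), masked_match(FROM_LOCAL_POD, 0x202, 0x20F))
```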

For multicast traffic, we support ingress rules for IGMP and egress rules for multicast data traffic, and apply NetworkPolicy to real traffic for both. Ingress for multicast traffic is not supported for now, while egress for IGMP only supports IGMP reports, which are handled by packetIn.

This patch maintains a rule map for each group address, used to fetch the rule that matches the member and has the highest priority. The packetIn handler then decides whether to allow or drop the IGMP report traffic based on the matched rule.

Signed-off-by: Bin Liu <biliu@vmware.com>
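The per-group rule lookup described above can be sketched as follows. This is an illustration under stated assumptions, not Antrea's implementation: the `Rule` type, the convention that a larger `priority` value wins, and the default of allowing traffic when no rule matches are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    priority: int        # assumption: larger value means higher priority
    members: frozenset   # Pods the rule applies to
    action: str          # "allow" or "drop"


def match_rule(rule_map, group, member):
    """Return the highest-priority rule for `group` that applies to `member`."""
    candidates = [r for r in rule_map.get(group, []) if member in r.members]
    return max(candidates, key=lambda r: r.priority, default=None)


def decide(rule_map, group, member, default="allow"):
    """Decide what to do with an IGMP report from `member` for `group`."""
    rule = match_rule(rule_map, group, member)
    return rule.action if rule else default


# One rule list per multicast group address.
rule_map = {
    "225.1.2.3": [
        Rule("allow-all", 1, frozenset({"pod-a", "pod-b"}), "allow"),
        Rule("drop-pod-a", 10, frozenset({"pod-a"}), "drop"),
    ],
}

print(decide(rule_map, "225.1.2.3", "pod-a"))
print(decide(rule_map, "225.1.2.3", "pod-b"))
```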

* Add a new feature gate `Multicluster` and configs in antrea-agent.conf, and a few extra items in antrea-agent cluster role including access to `Gateway` and `ClusterInfoImport`.
* Rename the `ServiceMarkTable` to `SNATMarkTable`.
* Add a controller for Gateway Nodes to watch Gateway and ClusterInfoImport's events. It will set up a few openflow rules to forward cross-cluster traffic to remote Gateway Nodes.
* Add a classification rule for cross-cluster traffic with global multicluster virtual MAC `aa:bb:cc:dd:ee:f0`.
* Add a rule in `L3Forwarding` table for cross-cluster request packets that modifies the destination MAC to global multicluster virtual MAC.
* Add a rule in `L3Forwarding` table for cross-cluster reply packets.
* Add a rule to `SNATMark` table to match the packets of multi-cluster Service connection and perform DNAT in DNAT zone.
* Add a rule to `SNAT` table to perform SNAT for any remote cluster traffic.
* Add a rule to `UnSNAT` table to perform de-SNAT if destination IP is local GatewayIP.
* Add a rule in `L2ForwardingCalc` table to load the global virtual multi-cluster MAC's output to `antrea-tun0`.
* Add a rule in `Output` table to match the multi-cluster traffic to forward the traffic from/to regular Node through the same port.
* Add a controller for regular Nodes to watch Gateway and ClusterInfoImport's events. It will set up a few openflow rules to forward cross-cluster traffic to local Gateway Node.
* Add a rule in L3Forwarding table for cross-cluster request packets, and modify the destination MAC to global multicluster virtual MAC.
* Add a rule in L3Forwarding table for cross-cluster reply packets.
* Add a rule in L2ForwardingCalc table to load the global virtual multi-cluster MAC's output to `antrea-tun0`.
* Use Service ClusterIPs instead of Pod IPs as MC Service's Endpoints. The ServiceExport controller will only watch ServiceExport and Service events, and wrap Service's ClusterIPs into a new Endpoint kind of ResourceExport.
* Include local Service ClusterIP as multi-cluster Service's Endpoints as well.
* Add unit test cases
* Refine e2e test for data plane change

Signed-off-by: Lan Luo <luola@vmware.com>
Co-authored-by: Hongliang Liu <lhongliang@vmware.com>

The failed case updates both the Service and the Pod. The issue happens when the Service update is processed by the NPL controller before the Pod update. In this case, the pod2 annotation is first fully removed. The NPL controller then adds a new nodeport with both tcp and udp rules to pod2 in response to the Service update. As a result, the pod2 annotation is: '[{80 10.10.10.10 61002 tcp [tcp]} {80 10.10.10.10 61002 udp [udp]}]'.

The fix removes the unnecessary label update for pod2 to avoid the unexpected annotation under this race condition. Annotation after the fix: '[{80 10.10.10.10 61001 tcp [tcp]} {80 10.10.10.10 61001 udp [udp]}]'

Fixes antrea-io#3847

Signed-off-by: Shuyang Xin <gavinx@vmware.com>

…a-io#3487)

- Use label selectors to filter Pods running on current Node.
- Translate the selected Pods to OVS ports, which will be used to filter the packets that should be mirrored or redirected.
- Translate the target device to the OVS port, which will be used as the target port the traffic should be mirrored or redirected.
- Install OpenFlow rules calculated using the above arguments.

Signed-off-by: Hongliang Liu <lhongliang@vmware.com>
Co-authored-by: Quan Tian <qtian@vmware.com>
Co-authored-by: Wenqi Qiu <wenqiq@vmware.com>
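The first two steps of this commit message (select local Pods by label, then translate them to OVS ports) can be sketched as below. The data shapes, the `select_ports` helper, and the Pod-to-port mapping are hypothetical and only illustrate the filtering logic, assuming an equality-based label selector.

```python
def select_ports(pods, selector, node, port_of):
    """Return the OVS ports of Pods on `node` whose labels satisfy `selector`.

    pods:     list of dicts with 'name', 'node' and 'labels' keys
    selector: dict of label key/value pairs that must all match
    port_of:  mapping from Pod name to OVS port number
    """
    ports = []
    for pod in pods:
        if pod["node"] != node:
            continue  # only Pods running on the current Node are considered
        if all(pod["labels"].get(k) == v for k, v in selector.items()):
            ports.append(port_of[pod["name"]])
    return sorted(ports)


pods = [
    {"name": "web-1", "node": "node-a", "labels": {"app": "web"}},
    {"name": "web-2", "node": "node-b", "labels": {"app": "web"}},
    {"name": "db-1", "node": "node-a", "labels": {"app": "db"}},
]
port_of = {"web-1": 3, "web-2": 4, "db-1": 5}

print(select_ports(pods, {"app": "web"}, "node-a", port_of))
```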

…ntrea-io#3862)

For a NodePort connection sourced from the external network or the local Node, the destination IP will be DNATed with a virtual IP, then the connection will be forwarded to OVS via the Antrea gateway. However, in UnSNATTable, a flow is installed to unSNAT reply packets of SNATed connections by matching the virtual IP as the destination IP. The flow is like the following:

```
table=UnSNAT, priority=200,ip,nw_dst=169.254.0.253 actions=ct(table=ConntrackZone,zone=65521,nat)
```

The request packets of a DNATed NodePort connection are also matched by the flow above, which is unnecessary. To optimize NodePort performance, this commit adds another virtual IP to identify and DNAT NodePort connections.

TCP_RR and TCP_CRR improvements are as follows:

```
Test     old TPS   new TPS   delta
TCP_CRR  3510.28   3847.76   +9.61%
TCP_RR   9574.29   10457.6   +9.23%
```

Signed-off-by: Hongliang Liu <lhongliang@vmware.com>

An IPv6 address must be wrapped in "[]" when used in network APIs. This patch ensures that the auto-discovered and the user-provided DNS servers use the correct format. It also adds the missing "dnsServerOverride" option to the configuration file.

Signed-off-by: Quan Tian <qtian@vmware.com>
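The bracket rule exists because an IPv6 literal contains colons, which would otherwise be ambiguous with the port separator. A minimal sketch of the formatting follows, using Python's standard `ipaddress` module for illustration; the helper name `host_port` is hypothetical and this is not the code from the patch.

```python
import ipaddress


def host_port(host, port):
    """Join host and port, wrapping bare IPv6 literals in square brackets."""
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not a bare IP literal (a hostname, or an already bracketed address).
        return f"{host}:{port}"
    if ip.version == 6:
        return f"[{host}]:{port}"
    return f"{host}:{port}"


print(host_port("10.96.0.10", 53))
print(host_port("fd00::10", 53))
```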

…trea-io#3888)

Change the default Namespace from `changeme` to `antrea-multicluster` in the leader manifest, so that users can deploy the leader controller by default without any Namespace replacement step.

Signed-off-by: Lan Luo <luola@vmware.com>
Co-authored-by: Jianjun Shen <shenj@vmware.com>

This PR adds multicast statistics support for the following cases:

- Supplement the current NetworkPolicy statistics implementation by parsing multicast-related flows; the results can be displayed with `kubectl get antreanetworkpolicystats multicast-networkpolicy-name`.
- Add a Node-level antctl command, `antctl get podmulticaststats`, which shows the inbound and outbound packet counts for each Pod interface.
- Add an extra `kubectl get multicastgroups` command, which shows which Pods have joined each multicast group across the whole cluster.

Signed-off-by: ceclinux <src655@gmail.com>

wenyingd and others added 19 commits May 6, 2022 13:20

Fix tolerations for Kubernetes >= 1.24 (antrea-io#3731)

7673d42

The taints for control-plane Nodes are changed for cluster version >= 1.24. Add a new toleration for Pods running on control-plane Nodes to make sure they can be scheduled. Signed-off-by: Xu Liu <xliu2@vmware.com>

add egress validate ns match (antrea-io#3727)

fe21260

Signed-off-by: Qiyue Yao <yaoq@vmware.com>

Support ClickHouse deployment with Persistent Volume (antrea-io#3608)

92dded2

Signed-off-by: Yanjun Zhou <zhouya@vmware.com>

Add some multi-cluster resources template YAML files to setup the clu…

c59f35c

…sterset (antrea-io#3736) Signed-off-by: hujiajing <hjiajing@vmware.com>

Fix mockgen target in Makefile (antrea-io#3707)

176e9e0

If the `mockgen` function is not defined before its caller, the script falls back to the default mockgen binary on the system. Rename the function and move it before its caller.

Signed-off-by: Lan Luo <luola@vmware.com>

E2E test of Antrea native policy ICMP support (antrea-io#3635)

ac44d6b

Add E2E tests and related content in doc for ICMP support PR antrea-io#3472 Signed-off-by: wgrayson <wgrayson@vmware.com>

Enable traceflow e2e test on Windows (antrea-io#3022)

94311f9

Signed-off-by: gran <gran@vmware.com>

Update svg background (antrea-io#3756)

4e971fd

Set the svg background to white so it renders clearly in dark mode. Signed-off-by: Lan Luo <luola@vmware.com>

Add GOPROXY support for code generation (antrea-io#3767)

30cc8c9

Signed-off-by: Lan Luo <luola@vmware.com>

ceclinux force-pushed the multicast_stats_imple branch 10 times, most recently from 4060b0e to 5a26c8e Compare May 12, 2022 01:01

ceclinux force-pushed the multicast_stats_imple branch from 4165693 to ce057df Compare June 5, 2022 10:13

tnqn and others added 7 commits June 6, 2022 12:01

Add known issues section to Vagrant documentation (antrea-io#3579)

5115ee2

Signed-off-by: Antonin Bas <abas@vmware.com>

Update Multi-cluster user guide and add quick-start

62a0145

Add a quick-start guide of setting up a ClusterSet with two clusters. Update the user guide with Multi-cluster Gateway configuration. Signed-off-by: Lan Luo <luola@vmware.com> Co-authored-by: Jianjun Shen <shenj@vmware.com>

multi-cluster bootstrap in antctl (antrea-io#3474)

3e1b254

Add new subcommands to Create or Delete multi-cluster Resources. Signed-off-by: hjiajing <hjiajing@vmware.com>

Compile antctl together with Agent bits on Windows (antrea-io#3867)

8e8aafb

Signed-off-by: wenyingd <wenyingd@vmware.com>

ceclinux force-pushed the multicast_stats_imple branch from ce057df to 2265153 Compare June 9, 2022 12:56

ceclinux force-pushed the multicast_stats_imple branch from 2265153 to 18072cd Compare June 9, 2022 17:07

[e2e][flexible-ipam] Fix TestPrometheus failed (antrea-io#3868)

dcd1019

Signed-off-by: gran <gran@vmware.com>

ceclinux force-pushed the multicast_stats_imple branch from 18072cd to ae757f1 Compare June 10, 2022 02:40

Fix unit test in pkg/agent/openflow/pipeline_test.go (antrea-io#3881)

0959e37

Signed-off-by: Hongliang Liu <lhongliang@vmware.com>

ceclinux force-pushed the multicast_stats_imple branch from ae757f1 to 584d0f3 Compare June 10, 2022 06:19

XinShuYang and others added 11 commits June 10, 2022 14:37

Improve documentation for Antrea-native policy (antrea-io#3512)

8eedd9f

Signed-off-by: Yang Ding <dingyang@vmware.com>

Fix pool CRD format in egress.md and service-loadbalancer.md (antrea-…

cd124a0

…io#3885) Signed-off-by: Jianjun Shen <shenj@vmware.com>

Add information about Theia in network-flow-visibility doc (antrea-io…

2eb48ee

…#3886) Signed-off-by: Jianjun Shen <shenj@vmware.com>

Fix ip6tables restore in pkg/agent/route/route_linux.go (antrea-io#3891)

b8279c9

Signed-off-by: Hongliang Liu <lhongliang@vmware.com>

Multicast networkPolicy e2e and documentation (antrea-io#3792)

6c8acea

* e2e: add cases for multicast NetworkPolicy
* add documentation for multicast NetworkPolicy

Signed-off-by: Bin Liu <biliu@vmware.com>

ceclinux force-pushed the multicast_stats_imple branch 2 times, most recently from d6e073e to 2d05542 Compare June 15, 2022 07:32

[Multicast] Add multicast statistics e2e tests

4d40665

Signed-off-by: ceclinux <src655@gmail.com>

ceclinux force-pushed the multicast_stats_imple branch from 2d05542 to 4d40665 Compare June 15, 2022 07:43
