v3.28.0-release-notes.md

10 May 2024

Important upgrade notes

  • This release re-enables VXLAN checksum offload for kernels > 5.7 (this takes effect when nodes are restarted). The offload was disabled in the past because of kernel bugs, but disabling it has a substantial performance cost. We've been unable to find an environment where it needs to be disabled, so we are re-enabling the offload with the intent of proving it works or finding a setup where it doesn't. If you encounter problems after upgrading and restarting nodes, please let us know and set "ChecksumOffloadBroken=true" in the FelixConfiguration's featureDetectOverride field to restore the previous behavior. calico #8774 (@tomastigera)

  • Breaking change: On upgrade, the UID of projectcalico.org/v3 resources will change. If you are using the Calico API server, it is recommended that you restart any controllers that manage projectcalico.org/v3 API resources after upgrading Calico, including the kube-controller-manager. This change was necessary in order to fix an issue where duplicate UIDs could be seen on different API resources, confusing Kubernetes garbage collection. calico #8586 (@caseydavenport)
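As a rough illustration of the checksum offload override mentioned in the first note, the sketch below builds a JSON merge patch that sets featureDetectOverride to ChecksumOffloadBroken=true. The function name is hypothetical, and the kubectl invocation in the usage note assumes the cluster-wide FelixConfiguration is named default; adjust both for your install.

```python
import json


def checksum_offload_patch(broken: bool = True) -> dict:
    """Build a merge patch for the FelixConfiguration spec.

    Only the field named in the upgrade note is set; the helper itself is
    hypothetical and exists purely for illustration.
    """
    value = "ChecksumOffloadBroken=" + ("true" if broken else "false")
    return {"spec": {"featureDetectOverride": value}}


if __name__ == "__main__":
    # Print the patch body so it can be passed to a patch command.
    print(json.dumps(checksum_offload_patch()))
```

The printed JSON could then be passed to something like `kubectl patch felixconfiguration default --type merge -p '<json>'`. A patch of this shape replaces any existing featureDetectOverride string, so combine the values by hand if other overrides are already set.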

GA support for IPv6 eBPF data plane

Support for IPv6 and dual-stack when using Calico's eBPF data plane is now GA.

Ensure pods are marked ready only after policies are rendered in dataplane

The Calico CNI plugin can now optionally delay pod startup until after the local dataplane has been programmed. This may be desired in high churn scenarios to ensure network connectivity is available from the moment the application software starts.

Support for multiple IP pools in Tigera Operator installs

The Tigera Operator now supports configuring multiple IP pools at install time using the Installation API. IP pools can also be modified after install via the Installation API.
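A minimal sketch of what an Installation with more than one IP pool might look like, generated from Python so the pools can be sanity-checked first. The pool names, CIDRs, and encapsulation choices are illustrative assumptions, and the overlap check is a local convenience added for this sketch, not a description of what the operator itself validates.

```python
import ipaddress
import json


def installation_with_pools(pools):
    """Return an Installation manifest (as a dict) carrying several IP pools.

    Each pool is a dict with at least "name" and "cidr". Overlapping CIDRs of
    the same IP version are rejected by this sketch before anything is emitted.
    """
    networks = [ipaddress.ip_network(p["cidr"]) for p in pools]
    for i, first in enumerate(networks):
        for second in networks[i + 1:]:
            if first.version == second.version and first.overlaps(second):
                raise ValueError(f"IP pools overlap: {first} and {second}")
    return {
        "apiVersion": "operator.tigera.io/v1",
        "kind": "Installation",
        "metadata": {"name": "default"},
        "spec": {"calicoNetwork": {"ipPools": list(pools)}},
    }


if __name__ == "__main__":
    # Example values only; substitute the pools your cluster needs.
    manifest = installation_with_pools([
        {"name": "pool-a", "cidr": "10.244.0.0/17",
         "encapsulation": "VXLAN", "natOutgoing": "Enabled"},
        {"name": "pool-b", "cidr": "10.244.128.0/17",
         "encapsulation": "VXLAN", "natOutgoing": "Enabled"},
    ])
    print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed document could be saved to a file and applied directly.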

Bug fixes

CI/CD integration: Please ensure that CI/CD systems configured to deploy Calico resources use the projectcalico.org/v3 API group. Resources deployed via the crd.projectcalico.org/v1 API group are for internal use only and may not sync properly.

General

  • Fix bug that inhibited garbage collection of Namespaces and ServiceAccounts with OwnerReferences. calico #8586 (@caseydavenport)
  • Fix that projectcalico.org/v3 resources with OwnerReferences were unable to be garbage collected due to non-unique UIDs. calico #8586 (@caseydavenport)
  • apiserver defaults logrus level based on -v argument calico #8699 (@caseydavenport)
  • Fix missing log line numbers in cni-installer log output calico #8698 (@caseydavenport)
  • Fix bug where key usage was not consistent calico #8581 (@rene-dekker)
  • calicoctl node run no longer executes the Kubernetes token watcher, which can only run inside a Kubernetes pod. calico #8483 (@fasaxc)
  • Fix missing permissions when uninstalling tigera-operator. calico #8413 (@KonstantinVishnivetskii)
  • Route reflector nodes now properly advertise Service LoadBalancer IP addresses even if there is no local endpoint on the node. calico #8358 (@AMacedoP)
  • Fix source IP spoofing annotation being ignored in etcd datastore mode calico #8347 (@fm9282)
  • Node now learns its IPv6 address in Kubernetes even if BGP is turned off and the CNI is not Calico. calico #8209 (@tomastigera)
  • Fix that cross-subnet routes were not moved when the VXLAN parent device was changed. calico #8279 (@fasaxc)

Windows

  • Add running of token refresher to Calico for Windows. calico #8563 (@coutinhop)
  • Fix confd issues when running on Windows operator installations (using HPC). calico #8421 (@coutinhop)
  • Fixed AutoCreateServiceAccountTokenSecret param handling in install-calico-windows.ps1 calico #8365 (@coutinhop)
  • Use KUBECONFIG env variable to build cluster config calico #8549 (@skmatti)

Other changes

General

  • Re-enable VXLAN checksum offload for kernels > 5.7 (takes effect when nodes are restarted). calico #8774 (@tomastigera)
  • Calico is now built with Go 1.22.3 against Kubernetes v1.28.7. Moving to Go 1.22 fixed a couple of latent bugs, detected by the new for loop semantics. calico #8717 (@fasaxc)
  • Calico now builds against Kubernetes v1.28.9 calico #8733 (@fasaxc)
  • Update flannel version to v0.24.3. calico #8595 (@laibe)
  • Bump iptables version of calico-node to 1.8.8 calico #8416 (@cyclinder)
  • Bump github.com/containerd/containerd from 1.6.23 to 1.6.26 calico #8355 (@dependabot[bot])
  • Bump github.com/opencontainers/runc from 1.1.6 to 1.1.12 calico #8468 (@dependabot[bot])
  • Update upstream CNI plugins and Flannel downloads to latest golang patches calico #8307 (@matthewdupre)
  • The calico/node-driver-registrar image now has labels for description/maintainers/etc as required by OpenShift certification. calico #8730 (@fasaxc)
  • Migrate to UBI based go-build calico #8103 (@hjiawei)
  • Adds options to Felix and the CNI to delay pods going ready until their dataplane programming is complete. calico #8469 (@aaaaaaaalex)
  • Add global +x permissions to endpoint-status dir (#8633) calico #8641 (@aaaaaaaalex)
  • Typha's typha_breadcrumb_size Prometheus stat now decays to zero if there are no breadcrumbs at all. Previously it would show the last value, or NaN, either of which was misleading. calico #8614 (@fasaxc)
  • Update the Grafana dashboard for Typha. Tested with Grafana v10.4.0. calico #8613 (@frozenprocess)
  • Remove unnecessary FIPS code calico #8538 (@rene-dekker)
  • Move key-cert-provisioner to the monorepo calico #8475 (@rene-dekker)
  • Host MTU auto-detection now ignores interfaces that are down. calico #8496 (@fasaxc)
  • Improve IPAM block garbage collection behavior for IP pools with small blocks. calico #8454 (@caseydavenport)
  • Clean up: VXLAN ARP and FDB programming is moved to a new sub-component. This should make it easier to maintain. calico #8449 (@fasaxc)
  • Felix now avoids accessing non-Calico IP sets. This reduces the scope for IP set compatibility errors when another app has created an IP set that Calico's version of IP set can't parse. calico #8387 (@mazdakn)
  • Move certificates permissions out of the else-block. calico #8369 (@rene-dekker)
  • Docker images now use COPY instead of ADD as recommended by CIS. Typha no longer relies on the tini init daemon; it handles the common signals internally (and it does not spawn any subprocesses, so there is no need for a reaper). calico #8289 (@fasaxc)
  • Only program failsafe rules for IP version of the CIDR calico #8286 (@tomastigera)
  • Felix now breaks up "policy jump rules" into new iptables "policy group chains" by selector. If two endpoints share a common sequence of policies they will share the same group chain, which reduces the number of rules that need to be programmed. calico #8098 (@fasaxc)
  • Felix and Typha now support enabling the Go standard library's debug server via the DebugHost/DebugPort configuration options. This allows process profiling data to be collected more easily. calico #8091 (@fasaxc)
  • Disable IPIP tunnel checksum offload on kernels <v5.7 calico #8031 (@cyclinder)
  • Improve BIRD liveness probe so that it confirms BIRD is responsive over its socket calico #7556 (@caseydavenport)
  • Run calico/apiserver as non-root by default calico #8576 (@hjiawei)
  • Move calicoctl binary to standard executable search path calico #8364 (@hjiawei)
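The policy group chains change listed above (calico #8098) can be pictured with a small sketch. This is not Felix's actual algorithm: it assumes each policy is a (name, selector) pair, groups consecutive policies that share a selector, and names each group by a hash of its contents so that endpoints with the same run of policies end up pointing at the same chain. The group-<hash> naming is hypothetical.

```python
import hashlib
from itertools import groupby


def group_policies(policies):
    """Split an ordered list of (name, selector) pairs into runs per selector."""
    groups = []
    for selector, run in groupby(policies, key=lambda p: p[1]):
        names = tuple(name for name, _ in run)
        key = "|".join((selector,) + names).encode()
        digest = hashlib.sha256(key).hexdigest()[:12]
        groups.append((f"group-{digest}", names))
    return groups


def program(endpoints):
    """Return (shared chains, per-endpoint jump lists) for all endpoints."""
    chains, jumps = {}, {}
    for endpoint, policies in endpoints.items():
        jumps[endpoint] = []
        for chain, names in group_policies(policies):
            chains.setdefault(chain, names)  # identical runs reuse one chain
            jumps[endpoint].append(chain)
    return chains, jumps


if __name__ == "__main__":
    common = [("allow-dns", "all()"), ("deny-egress", "all()")]
    chains, jumps = program({
        "web-1": common + [("web", "app == 'web'")],
        "db-1": common + [("db", "app == 'db'")],
    })
    print(len(chains), "chains shared by", len(jumps), "endpoints")
```

In this toy example the two endpoints carry six policy references between them but need only three chains, because the common prefix is programmed once.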

eBPF data plane

  • ebpf: wg6 traffic is allowed even if blocked by policy calico #8755 (@tomastigera)
  • ebpf: clean up stale icmp6 conntrack entries calico #8754 (@tomastigera)
  • ebpf: fixed fd leak calico #8750 (@tomastigera)
  • ebpf: When a pod connects via a service to self, ingress traffic is policed as if its source is the pod and not the host after MASQ calico #8719 (@tomastigera)
  • ebpf: fixed source IP used by host when CTLB is disabled and loopback device has non-local IP set. calico #8718 (@tomastigera)
  • ebpf: fix map creation during upgrade. calico #8690 (@sridhartigera)
  • ebpf: fix natOutgoing SNAT for icmp6 calico #8688 (@tomastigera)
  • ebpf: fixed source IP used by host when CTLB is disabled and loopback device has non-local IP set. calico #8618 (@tomastigera)
  • ebpf: Update map definitions in programs used in iptables mode to let libbpf v1.0+ load them successfully. calico #8610 (@mazdakn)
  • ebpf: XDP v6 requires Linux kernel 5.18+ (Ubuntu >=22.04) calico #8587 (@sridhartigera)
  • ebpf: host can access self via a service without CTLB calico #8564 (@tomastigera)
  • ebpf: Support dual stack. calico #8509 (@sridhartigera)
  • ebpf: projectcalico.org/natExcludeService=true makes kube-proxy ignore the service, which allows using a node-local DNS cache. calico #8484 (@tomastigera)
  • ebpf: fixes arm64 build for use with eBPF - Felix is able to enable ebpf (again) calico #8467 (@hjiawei)
  • Fix that Felix could briefly report "ready" in the middle of initialisation, before going "non-ready" again until the dataplane was in-sync. In eBPF mode, Felix will now report non-Ready if it fails to program some BPF programs. Previously, this would only be reported through logging. calico #8506 (@fasaxc)
  • ebpf: fixes possible holes in the list of NAT backends if there is a terminating pod. calico #8438 (@tomastigera)
  • ebpf: fixed cleaning of programs and maps when switching from ebpf to iptables mode. calico #8415 (@tomastigera)
  • ebpf: align defaultEndpointToHostAction with iptables - do not apply normal -hep policy to wep calico #8388 (@tomastigera)
  • ebpf: fixed that pods in nat-outgoing pools were SNATed when accessing the local host; they no longer are. calico #8380 (@tomastigera)
  • ebpf: setting BPFExcludeIPsFromNAT allows node-local dns cache to work calico #8338 (@tomastigera)
  • ebpf: fixed leakage of nodeport healthcheck servers calico #8313 (@tomastigera)
  • ebpf: don't stumble on unknown prog types passed as int in json calico #8295 (@tomastigera)
  • ebpf: ClusterIP reflects InternalTrafficPolicy=Local calico #8259 (@tomastigera)
  • ebpf: fixed policy cleanup after felix restart if a device is not present anymore. calico #8235 (@fasaxc)
  • eBPF: Support many more active policy rules per endpoint+direction. The BPF policy compiler now supports splitting policy programs if they get larger than the kernel would allow. The exact number of policy rules per endpoint depends on the details of the rules but for some real-world examples we see an increase from approximately 2k rules to approx 15k rules per endpoint direction. calico #8230 (@fasaxc)
  • ebpf: kube-proxy ServiceInternalTrafficPolicy is now GA and setting the gate would generate a warning message. calico #8213 (@tomastigera)
  • ebpf: BPFKubeProxyEndpointSlicesEnabled config option is deprecated, has no effect and will be removed. calico #8160 (@tomastigera)
  • ebpf: Config option added for host networked NAT. Change in the configs related to connect time load balancing. calico #8139 (@sridhartigera)
  • ebpf: an alternative cgroup2 mount path can be specified by setting the CALICO_CGROUP_PATH env var for node. calico #8085 (@amrut-asm)
  • ebpf: When a pod connects via a service to self, ingress traffic is policed as if its source is the pod and not the host after MASQ calico #6949 (@tomastigera)
  • ebpf: Use a label to clean up conntrack to terminating UDP backends calico #8480 (@tomastigera)
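To give a feel for the policy program splitting described above (calico #8230), here is a simplified sketch of the general idea: pack rules into consecutive sub-programs so that no single program exceeds a size budget. The per-rule costs and the budget are made-up numbers, and the real BPF policy compiler is certainly more involved than this greedy packing.

```python
def split_program(rule_costs, max_insns):
    """Greedily pack rules, in order, into programs of at most max_insns.

    rule_costs holds a hypothetical instruction count per rule. The result is
    a list of programs, each a list of rule indices in evaluation order.
    """
    programs, current, used = [], [], 0
    for index, cost in enumerate(rule_costs):
        if cost > max_insns:
            raise ValueError(f"rule {index} alone exceeds the budget")
        if current and used + cost > max_insns:
            # Budget exhausted: close this program and start the next one.
            programs.append(current)
            current, used = [], 0
        current.append(index)
        used += cost
    if current:
        programs.append(current)
    return programs


if __name__ == "__main__":
    # Five rules of 40 instructions each against a 100-instruction budget.
    print(split_program([40, 40, 40, 40, 40], 100))  # [[0, 1], [2, 3], [4]]
```

Rule order is preserved across the split, which matters because policy rules are evaluated in sequence.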

Helm chart

  • You can now specify kubernetesServiceEndpoint in the Helm chart to support Windows or eBPF. calico #8443 (@davhdavh)
  • Helm chart now supports specifying priorityClassName in values.yaml calico #8427 (@elsnepal)
  • Support affinity in tigera-operator chart calico #8095 (@gyuho)
  • Ability to set FelixConfiguration via the Helm chart calico #8559 (@ti-afra)

Windows

  • Added retry mechanism to Windows version retrieval in install-cni to address possible panics when the OS is not ready. calico #8462 (@coutinhop)