Why can't access local node service using NodePort by eBPF mode on arm64 #6065

Closed
TrevorTaoARM opened this issue May 11, 2022 · 15 comments
Labels
area/arm64 relates to arm64 area/bpf eBPF Dataplane issues

Comments

@TrevorTaoARM
Contributor

TrevorTaoARM commented May 11, 2022

I have a two-node k8s cluster. After enabling eBPF mode in Felix following the guide (https://projectcalico.docs.tigera.io/maintenance/ebpf/enabling-bpf),
I deployed a simple k8s NodePort service backed by Nginx pods with 2 replicas.
I can't access the NodePort service from the local node itself, but it can be accessed from other nodes.
With the original kube-proxy setup, the service can be accessed locally without problems,
so the two behaviors differ.
Is local access to a NodePort service intentionally disabled in eBPF mode,
or have I missed something in the eBPF mode setup?

Steps to Reproduce

The deployment yaml file:
trevor@vm3-arm-tx2-02:~/projects/k8s-cilium-examples$ cat nginx-app-deployment.yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  type: NodePort
  ports:
  - port: 80
    protocol: TCP
    name: http
  selector:
    app: nginx
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
        role: backend
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80

The 2 nodes here: 10.169.210.108(node1), 10.169.210.109(node2)
The generated k8s service:
nginx NodePort 172.16.1.3 80:31064/TCP 48m

From the node1:
$ curl 10.169.210.108:31064
curl: (7) Failed to connect to 10.169.210.108 port 31064: Connection refused
$ curl 10.169.210.109:31064

<title>Welcome to nginx!</title> ...
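The same check can be scripted so it is easy to repeat on each node after changing the calico-node image. This is only a minimal sketch: the helper name `nodeport_reachable` is made up for illustration, and it tests nothing more than whether the TCP handshake completes.

```python
import socket


def nodeport_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established.

    Hypothetical helper for illustration; it only checks the TCP handshake
    and does not send an HTTP request.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused" (as curl reported above) and timeouts.
        return False
```

With the addresses from this report, calling `nodeport_reachable("10.169.210.108", 31064)` on node1 would be expected to return `False` while the problem is present, and `nodeport_reachable("10.169.210.109", 31064)` would be expected to return `True`.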

I checked with tcpdump:
sudo tcpdump -i any port 31064
The captured packets show that a TCP RST, ACK was sent:
[image: tcpdump capture]

Calico version: 3.22.1

@tomastigera
Contributor

You are likely hitting issue #5957. Could you try with calico/node:master? If you are using the operator install, you need to annotate the DaemonSet with unsupported.operator.tigera.io/ignore: "true" as per https://github.com/tigera/operator#making-temporary-changes-to-components-the-operator-manages

@tomastigera tomastigera added area/bpf eBPF Dataplane issues area/arm64 relates to arm64 labels May 11, 2022
@TrevorTaoARM
Contributor Author

You are likely hitting this issue #5957, could you try with calico/node:master if you are using operator install you need to annotate the ds with unsupported.operator.tigera.io/ignore: "true" as per https://github.com/tigera/operator#making-temporary-changes-to-components-the-operator-manages

Thanks. I used the traditional manifest YAML install here. Is there any further guidance for fixing it in that case?

@TrevorTaoARM
Contributor Author

TrevorTaoARM commented May 12, 2022

I checked the output of calico-node -bpf. When I access the NodePort service from the remote node,
it shows:
# calico-node -bpf conntrack dump |grep 31064
2022-05-12 11:07:17.866 [INFO][2585910] confd/maps.go 308: Loaded map file descriptor. fd=0x9 name="/sys/fs/bpf/tc/globals/cali_v4_ct2"
ConntrackKey{proto=6 10.169.210.108:31064 <-> 10.169.210.109:56218} -> Entry{Type:1, Created:441882834421749, LastSeen:441882839381560, Flags: REVKey: ConntrackKey{proto=6 192.168.51.74:80 <-> 10.169.210.109:56218} NATSPort: 0} Age: 9.169131491s Active ago 9.16417168s
ConntrackKey{proto=6 192.168.51.74:80 <-> 10.169.210.109:56218} -> Entry{Type:2, Created:441882834414418, LastSeen:441882840497847, Flags: ext-local Data: {A2B:{Seqno:1534567773 SynSeen:true AckSeen:true FinSeen:true RstSeen:false Whitelisted:true Opener:false Ifindex:0} B2A:{Seqno:629205172 SynSeen:true AckSeen:true FinSeen:true RstSeen:false Whitelisted:true Opener:true Ifindex:2} OrigDst:10.169.210.108 OrigPort:31064 OrigSPort:0 TunIP:0.0.0.0}} Age: 9.169205962s Active ago 9.163122533s CLOSED

Here the local node IP is 10.169.210.108, remote node IP is 10.169.210.109. The NodePort is 31064.

When I access the NodePort service locally, it shows nothing:
# calico-node -bpf conntrack dump |grep 31064
2022-05-12 11:12:38.044 [INFO][2587966] confd/maps.go 308: Loaded map file descriptor. fd=0x9 name="/sys/fs/bpf/tc/globals/cali_v4_ct2"

I also checked the NAT map:
# calico-node -bpf nat dump
2022-05-13 03:28:25.557 [INFO][2959979] confd/maps.go 308: Loaded map file descriptor. fd=0x9 name="/sys/fs/bpf/tc/globals/cali_v4_nat_fe3"
2022-05-13 03:28:25.558 [INFO][2959979] confd/maps.go 308: Loaded map file descriptor. fd=0xa name="/sys/fs/bpf/tc/globals/cali_v4_nat_be"
...
10.169.210.108 port 31064 proto 6 id 5 count 2 local 1
5:0 192.168.51.74:80
5:1 192.168.169.16:80
The entry exists and looks correct.
So I still don't know why local access to the NodePort service does not show up in the conntrack dump and does not reach the target pods.
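To compare the remote and local cases without relying on grep, the conntrack dump can be parsed into structured keys. The sketch below assumes only the `ConntrackKey{proto=... A:port <-> B:port}` prefix seen in the dump output in this comment; `CtKey`, `parse_key` and `entries_for_port` are illustrative names, not part of Calico.

```python
import re
from typing import List, NamedTuple, Optional


class CtKey(NamedTuple):
    proto: int
    ip_a: str
    port_a: int
    ip_b: str
    port_b: int


# Matches the key at the start of each dump line (IPv4 only, as in the output above).
_KEY_RE = re.compile(
    r"^ConntrackKey\{proto=(\d+) ([\d.]+):(\d+) <-> ([\d.]+):(\d+)\}"
)


def parse_key(line: str) -> Optional[CtKey]:
    """Parse the leading ConntrackKey of a dump line, or return None."""
    m = _KEY_RE.match(line.strip())
    if m is None:
        return None  # e.g. the "[INFO] ... Loaded map file descriptor" line
    proto, ip_a, port_a, ip_b, port_b = m.groups()
    return CtKey(int(proto), ip_a, int(port_a), ip_b, int(port_b))


def entries_for_port(dump: str, port: int) -> List[CtKey]:
    """Return the keys in which either endpoint uses the given port."""
    keys = (parse_key(line) for line in dump.splitlines())
    return [k for k in keys if k is not None and port in (k.port_a, k.port_b)]
```

Unlike `grep 31064`, this matches on the key only, so the reverse entry (which mentions the NodePort only in `OrigPort`) is not returned. An empty result for the NodePort after a local connection attempt corresponds to the "shows nothing" case described here.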

@TrevorTaoARM
Contributor Author

Sorry, I forgot to mention that this is running on arm64, not amd64.

@tomastigera
Contributor

tomastigera commented May 12, 2022

Sorry, noticed that it is arm, but gave you a wrong image 😓 calico/node:master-amd64 It seems like you are hitting that issue.

@TrevorTaoARM
Contributor Author

Sorry, noticed that it is arm, but gave you a wrong image 😓 calico/node:master-amd64 It seems like you are hitting that issue.

NP. I created an eBPF-based e2e test environment on arm64, which changed 2 files:
node/tests/k8st/deploy_resources_on_kind_cluster.sh and
node/tests/k8st/infra/calico-kdd-ebpf.yaml (modified from calico-kdd.yaml)

But it seems that DNS resolution does not work in this environment:

+ echo 'Calico apiserver is running.'
Calico apiserver is running.
+ ../hack/test/kind/kubectl get po --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS NOMINATED NODE READINESS GATES
calico-apiserver calico-apiserver-5fbd4b85-crbg9 1/1 Running 0
default client 1/1 Running 0
default webserver-d4b6bf69f-4l9b7 1/1 Running 0
default webserver-d4b6bf69f-sslbf 1/1 Running 0
kube-system calico-kube-controllers-7dddb8f77f-tlpsw 1/1 Running 0
kube-system calico-node-6jr7c 1/1 Running 0
kube-system calico-node-6m666 1/1 Running 0
kube-system calico-node-g8zz4 1/1 Running 0
kube-system calico-node-wr6c6 1/1 Running 0
kube-system calicoctl 1/1 Running 0
kube-system coredns-64897985d-6nl6s 1/1 Running 0
kube-system coredns-64897985d-cwpqj 1/1 Running 0
kube-system etcd-kind-control-plane 1/1 Running 0
kube-system kube-apiserver-kind-control-plane 1/1 Running 0
kube-system kube-controller-manager-kind-control-plane 1/1 Running 0
kube-system kube-scheduler-kind-control-plane 1/1 Running 0
local-path-storage local-path-provisioner-5ddd94ff66-8xdt5 1/1 Running 0
metallb-system controller-57c458c998-wtcl4 1/1 Running 0
+ ../hack/test/kind/kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 443/TCP 8m8s
webserver-ipv4 NodePort 10.96.18.80 80:30696/TCP 49s
webserver-ipv6 NodePort fd00:10:96::aa59 80:32316/TCP 49s
+ test_connection 4
+ local svc=webserver-ipv4
++ ../hack/test/kind/kubectl exec client -- wget webserver-ipv4 -T 5 -O -
wget: bad address 'webserver-ipv4'
command terminated with exit code 1
+ output=
Makefile:373: recipe for target 'kind-k8st-setup' failed
make[1]: *** [kind-k8st-setup] Error 1
make[1]: Leaving directory '/mnt/sda2/home/trevor/projects/calico-src/calico-master/node'
Makefile:62: recipe for target 'e2e-test' failed
make: *** [e2e-test] Error 2

@TrevorTaoARM TrevorTaoARM changed the title Why can't access local node service using NodePort by eBPF mode Why can't access local node service using NodePort by eBPF mode on arm64 May 13, 2022
@TrevorTaoARM
Contributor Author

TrevorTaoARM commented May 13, 2022

I checked on x86_64, and it works correctly there.
For local service access, there seems to be no Conntrack information in the calico-node log output (at Debug level).
For remote access, the x86 calico-node log contains a lot of Conntrack information:
2022-05-13 14:32:11.037 [DEBUG][4006] felix/scanner.go 100: Examining conntrack entry entry=Entry{Type:1, Created:22079267163153, LastSeen:22079268855741, Flags: REVKey: ConntrackKey{proto=6 192.168.136.201:80 <-> 10.169.208.235:44822} NATSPort: 0} key=ConntrackKey{proto=6 10.169.210.127:31395 <-> 10.169.208.235:44822}
...

It seems that the "can't access the local NodePort service" case occurs only on arm64.
So I would like to ask whether there is any way to debug the route or Conntrack information for local NodePort service access in eBPF mode.

@tomastigera
Contributor

It seems the case of "can't access the local nodeport service" is only existed on arm64
I wonder why that would be 🤔

If you exec into the calico-node pod and run bpftool prog show, do you see any cgroup-related cali programs?

@tomastigera
Contributor

We are still talking about a host-networked pod or the node itself trying to connect to a NodePort / service, right?

@TrevorTaoARM
Contributor Author

Yes, this is just about the node itself trying to connect to a NodePort service.

@TrevorTaoARM
Contributor Author

It seems the case of "can't access the local nodeport service" is only existed on arm64
I wonder why that would be 🤔

If you exec into the calico-node pod and run bpftool prog show, do you see any cgroup-related cali programs?

I can't see any cgroup-related cali programs on arm64, but they can be found on x86_64.
This may be the difference between the two architectures.
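For a repeatable version of this check, the program list can be filtered in a script instead of by eye. The sketch below assumes that the JSON output of `bpftool --json prog show` is an array of objects with `type` and `name` fields; the function name `cali_cgroup_programs` is made up for illustration. It matches on a `cali` name prefix rather than full names, since the kernel may truncate long program names.

```python
import json
from typing import Any, Dict, List


def cali_cgroup_programs(bpftool_json: str) -> List[Dict[str, Any]]:
    """Return loaded programs whose type starts with "cgroup" and name with "cali".

    Assumes the input is the JSON array printed by `bpftool --json prog show`.
    Entries that lack a "type" or "name" field are skipped.
    """
    progs = json.loads(bpftool_json)
    return [
        p
        for p in progs
        if str(p.get("type", "")).startswith("cgroup")
        and str(p.get("name", "")).startswith("cali")
    ]
```

An empty list on a node running in eBPF mode would match the arm64 observation above, where no cgroup-related cali programs were visible.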

@TrevorTaoARM
Contributor Author

I checked the logs of the calico-node pod. It seems that the programs using the pinned map "cali_v4_ct_nats" fail to attach to the cgroup:
2022-05-08 03:41:47.778 [DEBUG][241699] felix/connecttime.go 127: Pinned map map="cali_v4_ct_nats" program="calico_connect_v4"
libbpf: prog 'calico_connect_v4': failed to attach to cgroup: Invalid argument
...
2022-05-08 03:41:47.783 [DEBUG][241699] felix/connecttime.go 127: Pinned map map="cali_v4_ct_nats" program="calico_sendmsg_v4"
libbpf: prog 'calico_sendmsg_v4': failed to attach to cgroup: Invalid argument
...
2022-05-08 03:41:47.786 [DEBUG][241699] felix/connecttime.go 127: Pinned map map="cali_v4_ct_nats" program="calico_recvmsg_v4"
libbpf: prog 'calico_recvmsg_v4': failed to attach to cgroup: Invalid argument
...
2022-05-08 03:41:47.789 [DEBUG][241699] felix/connecttime.go 127: Pinned map map="cali_v4_ct_nats" program="calico_sendmsg_v6"
libbpf: prog 'calico_sendmsg_v6': failed to attach to cgroup: Invalid argument
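The failing programs can be pulled out of a longer log automatically. This is a minimal sketch that assumes only the libbpf message format shown in the lines above; `failed_cgroup_attachments` is an illustrative name.

```python
import re
from typing import List, Tuple

# Matches the libbpf error lines quoted above and captures program name and reason.
_ATTACH_FAIL_RE = re.compile(
    r"libbpf: prog '([^']+)': failed to attach to cgroup: (.+)"
)


def failed_cgroup_attachments(log_text: str) -> List[Tuple[str, str]]:
    """Return (program, reason) pairs for every cgroup attach failure in the log."""
    failures = []
    for line in log_text.splitlines():
        m = _ATTACH_FAIL_RE.search(line)
        if m:
            failures.append((m.group(1), m.group(2).strip()))
    return failures
```

Running this over the excerpt above would list `calico_connect_v4`, `calico_sendmsg_v4`, `calico_recvmsg_v4` and `calico_sendmsg_v6`, each with the reason `Invalid argument`.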

@TrevorTaoARM
Contributor Author

TrevorTaoARM commented May 28, 2022

I think I have found the root cause of this issue:

  1. It is mainly due to the error "libbpf: prog 'calico_connect_v4': failed to attach to cgroup: Invalid argument", which appears in the calico-node log output.
  2. The error occurs only on Linux kernel v5.4 and earlier. bpf_program_attach_cgroup relies on the BPF_LINK_CREATE command of the bpf() syscall, which is available in kernel v5.8 and later.
  3. There is a bug in Calico v3.22.1: the return value of bpf_program_attach_cgroup() is always true, so bpf_program_attach_cgroup_legacy(), which is needed on kernel v5.4 and earlier, is never called.
  4. We checked the NodePort service after merging commit bd9ec65, which is available in Calico v3.23.1; it can now be accessed correctly from the local node.
  5. Both x86_64 and arm64 show the same result (NodePort service not reachable from the local node) on kernel v5.4 before commit bd9ec65.
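The intended control flow described in this list can be sketched as follows. This is not the Calico code and not the content of commit bd9ec65; it is only an illustration of "try the link-based attach, fall back to the legacy attach when it fails". The two callables are hypothetical stand-ins for bpf_program_attach_cgroup() and bpf_program_attach_cgroup_legacy(), assumed here to return 0 on success and a negative errno on failure.

```python
from typing import Callable


def attach_with_fallback(
    attach_link: Callable[[], int],
    attach_legacy: Callable[[], int],
) -> str:
    """Attach via the link-based path, falling back to the legacy path.

    Both arguments are stand-ins (assumption: 0 means success, a negative
    errno means failure). Returns which path succeeded.
    """
    rc = attach_link()
    if rc == 0:
        return "link"
    # On older kernels the link-based attach is expected to fail,
    # so the result must be checked before deciding to skip the fallback.
    rc = attach_legacy()
    if rc == 0:
        return "legacy"
    raise OSError(-rc, "both cgroup attach paths failed")
```

If the first result is never treated as a failure, the fallback branch is unreachable, which matches the symptom in this thread: no cgroup programs attached on kernel v5.4 and no connect-time NAT for local NodePort access.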

@TrevorTaoARM
Contributor Author

If you think my analysis is correct here, the issue can be closed.

@tomastigera
Contributor

Sounds right. The fix was also cherry-picked for 3.22.3 in #6056.
