KubeSpray - Cannot access local node services when using eBPF #7252

Closed · Aslan-Liu opened this issue Jan 30, 2023 · 46 comments · Fixed by #8380 or #8388
Labels: area/bpf (eBPF Dataplane issues), kind/support

Comments

@Aslan-Liu

Aslan-Liu commented Jan 30, 2023

I have one cluster and installed Prometheus in it, so each node has a service (HostIP:9100) that exports node information. However, if I run a Pod on Node1 (Node1 host IP: 172.21.149.119), I cannot access 172.21.149.119:9100 from that Pod. I can, however, access the same service on the other nodes, such as 172.21.149.xx:9100.

Expected Behavior

All local services on every node can be accessed from a Pod.

Current Behavior

Currently, only local services running on nodes other than the one hosting my Pod can be accessed from my Pod.

Possible Solution

#6065

Steps to Reproduce (for bugs)

  1. Install Calico v3.23.1 or v3.22.3
  2. Install Prometheus
  3. Run a Pod on any node
  4. In the Pod, use curl to access HostIP:9100, where HostIP is the IP of the node the Pod runs on (see the sketch below).
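
A minimal sketch of steps 3-4, assuming a node named node1 with host IP 172.21.149.119; the pod name, image, and the second node's IP are placeholders:

    # Pin a throwaway pod to node1 and curl the node's own IP from inside it.
    kubectl run curl-test --image=curlimages/curl --restart=Never \
      --overrides='{"spec":{"nodeName":"node1"}}' \
      --command -- sleep 3600

    # Expected to fail on the local node with the eBPF dataplane enabled:
    kubectl exec curl-test -- curl -m 5 http://172.21.149.119:9100/metrics

    # The same port on any other node works:
    kubectl exec curl-test -- curl -m 5 http://<other-node-ip>:9100/metrics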

Context

I can also see the following log messages in the calico-node Pod.

libbpf: prog 'calico_connect_v4': failed to attach to cgroup: Invalid argument
2023-01-30 12:01:50.409 [INFO][129] felix/connecttime.go 146: Loaded cgroup program cgroup="/run/calico/cgroup" program="calico_connect_v4"
libbpf: prog 'calico_sendmsg_v4': failed to attach to cgroup: Invalid argument
2023-01-30 12:01:50.499 [INFO][129] felix/connecttime.go 146: Loaded cgroup program cgroup="/run/calico/cgroup" program="calico_sendmsg_v4"
libbpf: prog 'calico_recvmsg_v4': failed to attach to cgroup: Invalid argument
2023-01-30 12:01:50.501 [INFO][129] felix/connecttime.go 146: Loaded cgroup program cgroup="/run/calico/cgroup" program="calico_recvmsg_v4"
libbpf: prog 'calico_sendmsg_v6': failed to attach to cgroup: Invalid argument
2023-01-30 12:01:50.502 [INFO][129] felix/connecttime.go 146: Loaded cgroup program cgroup="/run/calico/cgroup" program="calico_sendmsg_v6"
libbpf: prog 'calico_recvmsg_v6': failed to attach to cgroup: Invalid argument

I think this problem is very similar to #6065. However, the problem still exists after upgrading Calico to v3.23.1.

Your Environment

  • Calico version: v3.25.1
  • Orchestrator version (e.g. kubernetes, mesos, rkt): kubernetes 1.26.5
  • Operating System and version: Ubuntu 22.04 LTS

Can someone help me?

@coutinhop added the area/bpf (eBPF Dataplane issues) label Feb 6, 2023
@tomastigera
Contributor

Sorry for following up late; I missed this issue. Have you made any progress?

Do I understand correctly that you have a NodePort on 9100 and you cannot access it via that NodePort? Or do you just have a process listening on port 9100 on each node, and trying to connect to the same node's IP on that port from a local pod does not work?

@Aslan-Liu
Author

I did a test again.

First, I ran an nginx pod on worker2.
Second, I ran "nc -l worker2 9999" on worker2 natively.

With that set up, I entered the nginx pod and ran "curl http://worker2:9999". It doesn't work.
However, if I run the nginx pod on one of the other nodes, it works.

@spantazi

Same issue here with Calico 3.25.1 in eBPF mode (Kubernetes on amd64, Ubuntu 22.04 LTS nodes).
Pods on a node cannot connect to NodePort listeners on their own host's IP address.
I noticed that both metrics-server and Prometheus were unable to fetch metrics from the worker nodes hosting the corresponding collection pods, because those pods could not connect to their own host's IP:10250 (kubelet).

@tomastigera
Contributor

@spantazi would you be able to provide more information and, ideally, some logs? Could you follow this guide and gather eBPF logs on that node?
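
For reference, one way to capture such eBPF logs is to turn on BPF debug logging in Felix and read the kernel trace pipe on the affected node while reproducing the failure (a sketch; roughly what the guide describes, not a substitute for it):

    # Enable BPF debug logging (very noisy; revert when done).
    kubectl patch felixconfiguration default --type merge \
      -p '{"spec":{"bpfLogLevel":"Debug"}}'

    # On the affected node, stream the bpf_trace_printk output while
    # reproducing the failing connection from the local pod.
    sudo cat /sys/kernel/debug/tracing/trace_pipe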

What is your k8s environment? Some managed k8s or your own? Does the issue persist with 3.26.1?

@spantazi

@tomastigera
Upstream kubernetes (1.27.5) installed with kubespray.
Will try to gather some logs according to your guide and get back to you.

@tomastigera
Contributor

tomastigera commented Sep 19, 2023

@spantazi thank you. If you have trouble getting the logs, we can connect on the Calico Users Slack (#ebpf) and I can help out with that.

@Aslan-Liu
Author

I am glad to hear someone has the same problem as me. My k8s cluster was also built with Kubespray v2.22 (k8s 1.26.5) and uses Calico 3.25.1.
However, if I use Kubespray v2.23 with Calico 3.26.1, all Calico pods keep crashing. I think that may be a Kubespray issue; currently Kubespray v2.23 only supports Calico up to v3.25.2.

@Aslan-Liu
Author

Aslan-Liu commented Sep 20, 2023

@tomastigera I followed your instructions to get the logs, but they are too long, so I am only posting partial logs here.

Logs <Part 1>

      <idle>-0       [003] d.s..  5048.968164: bpf_trace_printk: ens192----------I: No metadata is shared by XDP

      <idle>-0       [003] d.s..  5048.968166: bpf_trace_printk: ens192----------I: IP id=41501

      <idle>-0       [003] d.s..  5048.968168: bpf_trace_printk: ens192----------I: TCP; ports: s=47296 d=22

      <idle>-0       [003] d.s..  5048.968169: bpf_trace_printk: ens192----------I: CT: lookup from ac15904a:47296

      <idle>-0       [003] d.s..  5048.968169: bpf_trace_printk: ens192----------I: CT: lookup to   ac15939c:22

      <idle>-0       [003] d.s..  5048.968170: bpf_trace_printk: ens192----------I: CT: Hit! NORMAL entry.

      <idle>-0       [003] d.s..  5048.968171: bpf_trace_printk: ens192----------I: CT: result: 0x2

      <idle>-0       [003] d.s..  5048.968172: bpf_trace_printk: ens192----------I: conntrack entry flags 0x0

      <idle>-0       [003] d.s..  5048.968172: bpf_trace_printk: ens192----------I: CT Hit

      <idle>-0       [003] d.s..  5048.968173: bpf_trace_printk: ens192----------I: Entering calico_tc_skb_accepted_entrypoint

      <idle>-0       [003] d.s..  5048.968174: bpf_trace_printk: ens192----------I: Entering calico_tc_skb_accepted

      <idle>-0       [003] d.s..  5048.968175: bpf_trace_printk: ens192----------I: src=ac15904a dst=ac15939c

      <idle>-0       [003] d.s..  5048.968175: bpf_trace_printk: ens192----------I: post_nat=0:0

      <idle>-0       [003] d.s..  5048.968176: bpf_trace_printk: ens192----------I: tun_ip=0

      <idle>-0       [003] d.s..  5048.968176: bpf_trace_printk: ens192----------I: pol_rc=1

      <idle>-0       [003] d.s..  5048.968176: bpf_trace_printk: ens192----------I: sport=47296

      <idle>-0       [003] d.s..  5048.968177: bpf_trace_printk: ens192----------I: flags=20

      <idle>-0       [003] d.s..  5048.968179: bpf_trace_printk: ens192----------I: ct_rc=2

      <idle>-0       [003] d.s..  5048.968179: bpf_trace_printk: ens192----------I: ct_related=0

      <idle>-0       [003] d.s..  5048.968180: bpf_trace_printk: ens192----------I: mark=0x1000000

      <idle>-0       [003] d.s..  5048.968180: bpf_trace_printk: ens192----------I: ip->ttl 63

      <idle>-0       [003] d.s..  5048.968181: bpf_trace_printk: ens192----------I: FIB family=2

      <idle>-0       [003] d.s..  5048.968182: bpf_trace_printk: ens192----------I: FIB tot_len=0

      <idle>-0       [003] d.s..  5048.968182: bpf_trace_printk: ens192----------I: FIB ifindex=2

      <idle>-0       [003] d.s..  5048.968183: bpf_trace_printk: ens192----------I: FIB l4_protocol=6

      <idle>-0       [003] d.s..  5048.968183: bpf_trace_printk: ens192----------I: FIB sport=47296

      <idle>-0       [003] d.s..  5048.968184: bpf_trace_printk: ens192----------I: FIB dport=22

      <idle>-0       [003] d.s..  5048.968184: bpf_trace_printk: ens192----------I: FIB ipv4_src=ac15904a

      <idle>-0       [003] d.s..  5048.968184: bpf_trace_printk: ens192----------I: FIB ipv4_dst=ac15939c

      <idle>-0       [003] d.s..  5048.968185: bpf_trace_printk: ens192----------I: Traffic is towards the host namespace, doing Linux FIB lookup

      <idle>-0       [003] d.s..  5048.968188: bpf_trace_printk: ens192----------I: FIB lookup failed (FIB problem): 4.

      <idle>-0       [003] d.s..  5048.968188: bpf_trace_printk: ens192----------I: Traffic is towards host namespace, marking with 0x1000000.

      <idle>-0       [003] d.s..  5048.968189: bpf_trace_printk: ens192----------I: Final result=ALLOW (0). Program execution time: 23051ns

Logs <Part 2>

        bash-17756   [003] d.s1.  5039.444510: bpf_trace_printk: calic440f455693-E: New packet at ifindex=5; mark=0

        bash-17756   [003] d.s1.  5039.444515: bpf_trace_printk: calic440f455693-E: ARP: allowing packet

        bash-17756   [003] d.s1.  5039.444516: bpf_trace_printk: calic440f455693-E: Traffic is towards host namespace, marking with 0x1000000.

        bash-17756   [003] d.s1.  5039.444517: bpf_trace_printk: calic440f455693-E: Final result=ALLOW (0). Program execution time: 2135ns

        bash-17756   [003] d.s1.  5039.444540: bpf_trace_printk: calic440f455693-I: New packet at ifindex=5; mark=0

        bash-17756   [003] d.s1.  5039.444544: bpf_trace_printk: calic440f455693-I: ARP: allowing packet

        bash-17756   [003] d.s1.  5039.444545: bpf_trace_printk: calic440f455693-I: Final result=ALLOW (0). Program execution time: 706ns

        bash-17756   [003] d.s1.  5039.444551: bpf_trace_printk: calic440f455693-E: New packet at ifindex=5; mark=0

        bash-17756   [003] d.s1.  5039.444553: bpf_trace_printk: calic440f455693-E: IP id=19994

        bash-17756   [003] d.s1.  5039.444554: bpf_trace_printk: calic440f455693-E: TCP; ports: s=42894 d=9999

        bash-17756   [003] d.s1.  5039.444555: bpf_trace_printk: calic440f455693-E: CT: lookup from f009ad41:42894

        bash-17756   [003] d.s1.  5039.444555: bpf_trace_printk: calic440f455693-E: CT: lookup to   ac15939c:9999

        bash-17756   [003] d.s1.  5039.444556: bpf_trace_printk: calic440f455693-E: CT: Miss for TCP SYN, NEW flow.

        bash-17756   [003] d.s1.  5039.444557: bpf_trace_printk: calic440f455693-E: CT: result: NEW.

        bash-17756   [003] d.s1.  5039.444558: bpf_trace_printk: calic440f455693-E: conntrack entry flags 0x0

        bash-17756   [003] d.s1.  5039.444559: bpf_trace_printk: calic440f455693-E: NAT: 1st level lookup addr=ac15939c port=9999 tcp

        bash-17756   [003] d.s1.  5039.444560: bpf_trace_printk: calic440f455693-E: NAT: Miss.

        bash-17756   [003] d.s1.  5039.444560: bpf_trace_printk: calic440f455693-E: NAT: nodeport miss

        bash-17756   [003] d.s1.  5039.444563: bpf_trace_printk: calic440f455693-E: Workload RPF check src=f009ad41 skb iface=5.

        bash-17756   [003] d.s1.  5039.444564: bpf_trace_printk: calic440f455693-E: Source is in NAT-outgoing pool but dest is not, need to SNAT.

        bash-17756   [003] d.s1.  5039.444565: bpf_trace_printk: calic440f455693-E: Socket cookie: 1059

        bash-17756   [003] d.s1.  5039.444566: bpf_trace_printk: calic440f455693-E: Post-NAT dest IP is local host.

        bash-17756   [003] d.s1.  5039.444567: bpf_trace_printk: calic440f455693-E: About to jump to policy program.

        bash-17756   [003] d.s1.  5039.444568: bpf_trace_printk: calic440f455693-E: Entering calico_tc_skb_drop

        bash-17756   [003] d.s1.  5039.444570: bpf_trace_printk: calic440f455693-E: proto=6

        bash-17756   [003] d.s1.  5039.444571: bpf_trace_printk: calic440f455693-E: src=f009ad41 dst=ac15939c

        bash-17756   [003] d.s1.  5039.444572: bpf_trace_printk: calic440f455693-E: pre_nat=ac15939c:9999

        bash-17756   [003] d.s1.  5039.444573: bpf_trace_printk: calic440f455693-E: post_nat=ac15939c:9999

        bash-17756   [003] d.s1.  5039.444573: bpf_trace_printk: calic440f455693-E: tun_ip=0

        bash-17756   [003] d.s1.  5039.444574: bpf_trace_printk: calic440f455693-E: pol_rc=2

        bash-17756   [003] d.s1.  5039.444574: bpf_trace_printk: calic440f455693-E: sport=42894

        bash-17756   [003] d.s1.  5039.444575: bpf_trace_printk: calic440f455693-E: flags=0x5

        bash-17756   [003] d.s1.  5039.444575: bpf_trace_printk: calic440f455693-E: ct_rc=0

        bash-17756   [003] d.s1.  5039.444576: bpf_trace_printk: calic440f455693-E: DENY due to policy

Is this enough for debugging? (Sorry, I don't know how to upload the complete logs here.)
Destination port 9999 is my target service.

I tested this issue with Calico 3.25.1 and 3.25.2; both have the same problem.

@lwr20
Member

lwr20 commented Sep 20, 2023

Sorry, I don't know how to upload the complete logs here.

Consider uploading the file as a Gist at https://gist.github.com/ and link it here?

@Aslan-Liu
Author

Sorry, I don't know how to upload the complete logs here.

Consider uploading the file as a Gist at https://gist.github.com/ and link it here?

Sorry, I can't. My network is controlled by my company. I can't upload any file to other sites.

@lwr20
Member

lwr20 commented Sep 20, 2023

Can you cut-and-paste it in?

@Aslan-Liu
Author

Can you cut-and-paste it in?

No, it has 15548 lines in total; it's too long. However, I see "DENY due to policy" in the log messages. Could this be related to #7707? I also upgraded to Calico v3.25.2, but the issue still exists.

@tomastigera
Contributor

@Aslan-Liu thanks for the logs. It is a firehose, but that is going to change with 3.27 😅

You said that port 9999 is a nodeport, but from the logs it does not seem to be treated like one:

        bpf_trace_printk: calic440f455693-E: New packet at ifindex=5; mark=0
        bpf_trace_printk: calic440f455693-E: IP id=19994
        bpf_trace_printk: calic440f455693-E: TCP; ports: s=42894 d=9999
        bpf_trace_printk: calic440f455693-E: CT: lookup from f009ad41:42894
        bpf_trace_printk: calic440f455693-E: CT: lookup to   ac15939c:9999
        bpf_trace_printk: calic440f455693-E: CT: Miss for TCP SYN, NEW flow.
        bpf_trace_printk: calic440f455693-E: CT: result: NEW.
        bpf_trace_printk: calic440f455693-E: conntrack entry flags 0x0
        bpf_trace_printk: calic440f455693-E: NAT: 1st level lookup addr=ac15939c port=9999 tcp
        bpf_trace_printk: calic440f455693-E: NAT: Miss.
        bpf_trace_printk: calic440f455693-E: NAT: nodeport miss

Could you share the output of kubectl exec -n calico-system calico-node-XYZ -- calico-node -bpf nat dump on this node, as described in the same guide? It will show how services are understood by the dataplane. The output looks somewhat like this:

       10.101.0.10 port 80 proto 6 id 1 count 1 local 0
               1:0      10.65.0.2:8055
       172.18.0.12 port 30333 proto 6 id 1 count 1 local 0
               1:0      10.65.0.2:8055
       255.255.255.255 port 30333 proto 6 id 1 count 1 local 0
               1:0      10.65.0.2:8055
       10.101.0.1 port 443 proto 6 id 0 count 1 local 0
                0:0      172.18.0.5:6443

We are looking for entries with port 9999, especially 255.255.255.255 port 9999 and 172.21.147.156, which is the node IP that the log above is trying to use (0xac15939c).

As for the DENY due to policy, it is likely because your policies do not allow your pods to access a node on this port. If the service resolution were working, that would not happen, because you would see the service backend's IP:port instead. So it seems like the policy is actually doing what it is supposed to do.

@Aslan-Liu
Author

Aslan-Liu commented Sep 21, 2023

@tomastigera Actually, port 9999 is not a NodePort. I just run an HTTP server on the host listening on port 9999, so I think we would not see port 9999 in the output of 'kubectl exec -n calico-system calico-node-XYZ -- calico-node -bpf nat dump' anyway.

Am I right? Or do you still need that output, or is there any other information you need?

Just let me know.

Thanks

@tomastigera
Contributor

Actually, port 9999 is not a NodePort. I just run an HTTP server on the host listening on port 9999

Ohhh, sorry for the misunderstanding - "service" is a super overloaded term 🤷‍♂️ In that case, the nat dump is pointless.

What you can do, though, is dump the policy on that interface using kubectl exec -n calico-system calico-node-XYZ -- calico-node -bpf policy dump <dev> all. It dumps both ingress and egress policy, describes the rules, and includes counters showing which rules are hit. Note that some rules use ipsets; you can dump those as well using ipsets dump.
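
For example, assuming the pod in question sits behind the calic440f455693 interface seen in the logs above, the dump could be gathered roughly like this (a sketch; the pod IP and calico-node pod name are placeholders):

    # On the node, find the host-side interface for the pod: Calico programs
    # a /32 route to the pod IP via its cali* device.
    ip route | grep <pod-ip>        # e.g. "... dev calic440f455693 ..."

    # Dump the ingress and egress policy programs for that interface from the
    # calico-node pod running on the same node.
    kubectl exec -n calico-system calico-node-XYZ -- \
      calico-node -bpf policy dump calic440f455693 all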

@Aslan-Liu
Author

Aslan-Liu commented Sep 22, 2023

@tomastigera Here is the output from my environment:

# calico-node -bpf policy dump calic440f455693 all
IfaceName: calic440f455693
Hook: tc egress
Error:
Policy Info:
start:
      bf16000000000000 Mov64 dst=R6 src=R1 off=0 imm=0x00000000/0
      b701000000000000 MovImm64 dst=R1 src=R0 off=0 imm=0x00000000/0
      631afcff00000000 StoreReg32 dst=R10 src=R1 off=-4 imm=0x00000000/0
      bfa2000000000000 Mov64 dst=R2 src=R10 off=0 imm=0x00000000/0
      07020000fcffffff AddImm64 dst=R2 src=R0 off=0 imm=0xfffffffc/-4
// Load packet metadata saved by previous program
      181100000c000000 LoadImm64 dst=R1 src=R1 off=0 imm=0x0000000c/12
      0000000000000000 LoadImm64Pt2 dst=R0 src=R0 off=0 imm=0x00000000/0
      8500000001000000 Call dst=R0 src=R0 off=0 imm=0x00000001/1                                       call bpf_map_lookup_elem
      1500110000000000 JumpEqImm64 dst=R0 src=R0 off=17 imm=0x00000000/0                               goto exit
// Save state pointer in register R9
      bf09000000000000 Mov64 dst=R9 src=R0 off=0 imm=0x00000000/0
policy:
      7991980100000000 LoadReg64 dst=R1 src=R9 off=408 imm=0x00000000/0                                R1 = *(u64 *)(R9 + 408) /* state->flags */
      570100000c000000 AndImm64 dst=R1 src=R0 off=0 imm=0x0000000c/12
      5501010000000000 JumpNEImm64 dst=R1 src=R0 off=1 imm=0x00000000/0                                goto allowed_by_host_policy
      0500000000000000 JumpA dst=R0 src=R0 off=0 imm=0x00000000/0                                      goto allowed_by_host_policy
// Start of rule action:"allow" rule_id:"aBMQCbsUMESPKGRp"
// count = 0
allowed_by_host_policy:
      7191640000000000 LoadReg8 dst=R1 src=R9 off=100 imm=0x00000000/0                                 R1 = *(u8 *)(R9 + 100) /* state->rules_hit */
      35010c0020000000 JumpGEImm64 dst=R1 src=R0 off=12 imm=0x00000020/32                              goto allow
      bf12000000000000 Mov64 dst=R2 src=R1 off=0 imm=0x00000000/0
      0702000001000000 AddImm64 dst=R2 src=R0 off=0 imm=0x00000001/1
      7329640000000000 StoreReg8 dst=R9 src=R2 off=100 imm=0x00000000/0                                *(u8 *) (R9 + 100) /* state->rules_hit */ = R2
      6701000003000000 ShiftLImm64 dst=R1 src=R0 off=0 imm=0x00000003/3
      0701000068000000 AddImm64 dst=R1 src=R0 off=0 imm=0x00000068/104
      1802000037ab8758 LoadImm64 dst=R2 src=R0 off=0 imm=0x5887ab37/1485286199
      0000000080e3d4e6 LoadImm64Pt2 dst=R0 src=R0 off=0 imm=0xe6d4e380/-422255744
      0f91000000000000 Add64 dst=R1 src=R9 off=0 imm=0x00000000/0
      7b21000000000000 StoreReg64 dst=R1 src=R2 off=0 imm=0x00000000/0                                 *(u64 *) (R1 + 0) /*  */ = R2
      0500020000000000 JumpA dst=R0 src=R0 off=2 imm=0x00000000/0                                      goto allow
// End of rule aBMQCbsUMESPKGRp
exit:
      b700000002000000 MovImm64 dst=R0 src=R0 off=0 imm=0x00000002/2
      9500000000000000 Exit dst=R0 src=R0 off=0 imm=0x00000000/0
allow:
      b401000001000000 MovImm32 dst=R1 src=R0 off=0 imm=0x00000001/1
      6319540000000000 StoreReg32 dst=R9 src=R1 off=84 imm=0x00000000/0                                *(u32 *) (R9 + 84) /* state->pol_rc */ = R1
      bf61000000000000 Mov64 dst=R1 src=R6 off=0 imm=0x00000000/0
      1812000046000000 LoadImm64 dst=R2 src=R1 off=0 imm=0x00000046/70
      0000000000000000 LoadImm64Pt2 dst=R0 src=R0 off=0 imm=0x00000000/0
      b403000003000000 MovImm32 dst=R3 src=R0 off=0 imm=0x00000003/3
      850000000c000000 Call dst=R0 src=R0 off=0 imm=0x0000000c/12                                      call bpf_tail_call
      b40100000a000000 MovImm32 dst=R1 src=R0 off=0 imm=0x0000000a/10
      6319540000000000 StoreReg32 dst=R9 src=R1 off=84 imm=0x00000000/0                                *(u32 *) (R9 + 84) /* state->pol_rc */ = R1
      b700000002000000 MovImm64 dst=R0 src=R0 off=0 imm=0x00000002/2
      9500000000000000 Exit dst=R0 src=R0 off=0 imm=0x00000000/0
IfaceName: calic440f455693
Hook: tc ingress
Error:
Policy Info:
start:
      bf16000000000000 Mov64 dst=R6 src=R1 off=0 imm=0x00000000/0
      b701000000000000 MovImm64 dst=R1 src=R0 off=0 imm=0x00000000/0
      631afcff00000000 StoreReg32 dst=R10 src=R1 off=-4 imm=0x00000000/0
      bfa2000000000000 Mov64 dst=R2 src=R10 off=0 imm=0x00000000/0
      07020000fcffffff AddImm64 dst=R2 src=R0 off=0 imm=0xfffffffc/-4
// Load packet metadata saved by previous program
      181100000c000000 LoadImm64 dst=R1 src=R1 off=0 imm=0x0000000c/12
      0000000000000000 LoadImm64Pt2 dst=R0 src=R0 off=0 imm=0x00000000/0
      8500000001000000 Call dst=R0 src=R0 off=0 imm=0x00000001/1                                       call bpf_map_lookup_elem
      1500240000000000 JumpEqImm64 dst=R0 src=R0 off=36 imm=0x00000000/0                               goto exit
// Save state pointer in register R9
      bf09000000000000 Mov64 dst=R9 src=R0 off=0 imm=0x00000000/0
policy:
      7991980100000000 LoadReg64 dst=R1 src=R9 off=408 imm=0x00000000/0                                R1 = *(u64 *)(R9 + 408) /* state->flags */
      570100000c000000 AndImm64 dst=R1 src=R0 off=0 imm=0x0000000c/12
      5501010000000000 JumpNEImm64 dst=R1 src=R0 off=1 imm=0x00000000/0                                goto to_or_from_host
      05000c0000000000 JumpA dst=R0 src=R0 off=12 imm=0x00000000/0                                     goto allowed_by_host_policy
to_or_from_host:
      7191640000000000 LoadReg8 dst=R1 src=R9 off=100 imm=0x00000000/0                                 R1 = *(u8 *)(R9 + 100) /* state->rules_hit */
      3501160020000000 JumpGEImm64 dst=R1 src=R0 off=22 imm=0x00000020/32                              goto deny
      bf12000000000000 Mov64 dst=R2 src=R1 off=0 imm=0x00000000/0
      0702000001000000 AddImm64 dst=R2 src=R0 off=0 imm=0x00000001/1
      7329640000000000 StoreReg8 dst=R9 src=R2 off=100 imm=0x00000000/0                                *(u8 *) (R9 + 100) /* state->rules_hit */ = R2
      6701000003000000 ShiftLImm64 dst=R1 src=R0 off=0 imm=0x00000003/3
      0701000068000000 AddImm64 dst=R1 src=R0 off=0 imm=0x00000068/104
      1802000000000000 LoadImm64 dst=R2 src=R0 off=0 imm=0x00000000/0
      0000000000000000 LoadImm64Pt2 dst=R0 src=R0 off=0 imm=0x00000000/0
      0f91000000000000 Add64 dst=R1 src=R9 off=0 imm=0x00000000/0
      7b21000000000000 StoreReg64 dst=R1 src=R2 off=0 imm=0x00000000/0                                 *(u64 *) (R1 + 0) /*  */ = R2
      05000c0000000000 JumpA dst=R0 src=R0 off=12 imm=0x00000000/0                                     goto deny
// Start of rule action:"allow" rule_id:"8iYOzpfn3SU3eATK"
// count = 0
rule_0_no_match:
allowed_by_host_policy:
      7191640000000000 LoadReg8 dst=R1 src=R9 off=100 imm=0x00000000/0                                 R1 = *(u8 *)(R9 + 100) /* state->rules_hit */
      3501130020000000 JumpGEImm64 dst=R1 src=R0 off=19 imm=0x00000020/32                              goto allow
      bf12000000000000 Mov64 dst=R2 src=R1 off=0 imm=0x00000000/0
      0702000001000000 AddImm64 dst=R2 src=R0 off=0 imm=0x00000001/1
      7329640000000000 StoreReg8 dst=R9 src=R2 off=100 imm=0x00000000/0                                *(u8 *) (R9 + 100) /* state->rules_hit */ = R2
      6701000003000000 ShiftLImm64 dst=R1 src=R0 off=0 imm=0x00000003/3
      0701000068000000 AddImm64 dst=R1 src=R0 off=0 imm=0x00000068/104
      18020000d9e1f7ff LoadImm64 dst=R2 src=R0 off=0 imm=0xfff7e1d9/-532007
      0000000087e4c74d LoadImm64Pt2 dst=R0 src=R0 off=0 imm=0x4dc7e487/1304945799
      0f91000000000000 Add64 dst=R1 src=R9 off=0 imm=0x00000000/0
      7b21000000000000 StoreReg64 dst=R1 src=R2 off=0 imm=0x00000000/0                                 *(u64 *) (R1 + 0) /*  */ = R2
      0500090000000000 JumpA dst=R0 src=R0 off=9 imm=0x00000000/0                                      goto allow
// End of rule 8iYOzpfn3SU3eATK
rule_2_no_match:
deny:
      b401000002000000 MovImm32 dst=R1 src=R0 off=0 imm=0x00000002/2
      6319540000000000 StoreReg32 dst=R9 src=R1 off=84 imm=0x00000000/0                                *(u32 *) (R9 + 84) /* state->pol_rc */ = R1
      bf61000000000000 Mov64 dst=R1 src=R6 off=0 imm=0x00000000/0
      1812000048000000 LoadImm64 dst=R2 src=R1 off=0 imm=0x00000048/72
      0000000000000000 LoadImm64Pt2 dst=R0 src=R0 off=0 imm=0x00000000/0
      b403000005000000 MovImm32 dst=R3 src=R0 off=0 imm=0x00000005/5
      850000000c000000 Call dst=R0 src=R0 off=0 imm=0x0000000c/12                                      call bpf_tail_call
exit:
      b700000002000000 MovImm64 dst=R0 src=R0 off=0 imm=0x00000002/2
      9500000000000000 Exit dst=R0 src=R0 off=0 imm=0x00000000/0
allow:
      b401000001000000 MovImm32 dst=R1 src=R0 off=0 imm=0x00000001/1
      6319540000000000 StoreReg32 dst=R9 src=R1 off=84 imm=0x00000000/0                                *(u32 *) (R9 + 84) /* state->pol_rc */ = R1
      bf61000000000000 Mov64 dst=R1 src=R6 off=0 imm=0x00000000/0
      1812000048000000 LoadImm64 dst=R2 src=R1 off=0 imm=0x00000048/72
      0000000000000000 LoadImm64Pt2 dst=R0 src=R0 off=0 imm=0x00000000/0
      b403000003000000 MovImm32 dst=R3 src=R0 off=0 imm=0x00000003/3
      850000000c000000 Call dst=R0 src=R0 off=0 imm=0x0000000c/12                                      call bpf_tail_call
      b40100000a000000 MovImm32 dst=R1 src=R0 off=0 imm=0x0000000a/10
      6319540000000000 StoreReg32 dst=R9 src=R1 off=84 imm=0x00000000/0                                *(u32 *) (R9 + 84) /* state->pol_rc */ = R1
      b700000002000000 MovImm64 dst=R0 src=R0 off=0 imm=0x00000002/2
      9500000000000000 Exit dst=R0 src=R0 off=0 imm=0x00000000/0
2023-09-22 02:36:54.616 [ERROR][2259] confd/policy_debug.go 78: Failed to dump policy info. error=stat /var/run/calico/bpf/policy/calic440f455693_xdp_v4.json: no such file or directory
# ipset list
Name: cali40all-ipam-pools
Type: hash:net
Revision: 7
Header: family inet hashsize 1024 maxelem 1048576 bucketsize 12 initval 0x1493c672
Size in memory: 504
References: 0
Number of entries: 1
Members:
240.0.0.0/12

Name: cali40masq-ipam-pools
Type: hash:net
Revision: 7
Header: family inet hashsize 1024 maxelem 1048576 bucketsize 12 initval 0xc8f7e12b
Size in memory: 504
References: 0
Number of entries: 1
Members:
240.0.0.0/12

Name: cali40all-vxlan-net
Type: hash:net
Revision: 7
Header: family inet hashsize 1024 maxelem 1048576 bucketsize 12 initval 0x15dc28ab
Size in memory: 696
References: 0
Number of entries: 5
Members:
172.21.147.151
172.21.147.152
172.21.147.153
172.21.147.154
172.21.147.155

The information above was dumped from the calico-node pod running on host 172.21.147.156.
I also run an nginx pod on host 172.21.147.156.
Finally, my HTTP server runs natively on host 172.21.147.156 (not in a pod) and listens on port 9999.

Can you see anything wrong here?

@tomastigera
Contributor

You can dump them as well using ipsets dump

Oops, I meant kubectl exec -n calico-system calico-node-XYZ -- calico-node -bpf ipsets dump; the ipset list output shows irrelevant iptables-related ipsets.

Nevertheless, it does not matter, there is basically no policy, right? 🤔

@Aslan-Liu
Author

You can dump them as well using ipsets dump

Oops, I meant kubectl exec -n calico-system calico-node-XYZ -- calico-node -bpf ipsets dump; the ipset list output shows irrelevant iptables-related ipsets.

Nevertheless, it does not matter, there is basically no policy, right? 🤔

Yes, sir. This is a new cluster. I did not add any network policies there.

@tomastigera
Contributor

What if you set the config option defaultEndpointToHostAction to Accept (https://docs.tigera.io/calico/latest/reference/resources/felixconfig)? Based on the logs I do not think we hit that, but let's cover all bases.
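
For instance, assuming the default FelixConfiguration resource is in use (as in the YAML shown in the next comment), the setting could be applied with a patch like this (a sketch):

    kubectl patch felixconfiguration default --type merge \
      -p '{"spec":{"defaultEndpointToHostAction":"Accept"}}'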

@Aslan-Liu
Author

Aslan-Liu commented Sep 27, 2023

What if you set the config option defaultEndpointToHostAction to Accept (https://docs.tigera.io/calico/latest/reference/resources/felixconfig)? Based on the logs I do not think we hit that, but let's cover all bases.

I modified it and it still does not work. Here is my Felix config:

apiVersion: crd.projectcalico.org/v1
kind: FelixConfiguration
metadata:
  annotations:
    projectcalico.org/metadata: '{"uid":"2af5e1da-dfcf-46d2-94dd-7d2c1015214e","creationTimestamp":"2023-09-20T06:24:14Z"}'
  creationTimestamp: "2023-09-20T06:24:14Z"
  generation: 2
  name: default
  resourceVersion: "27168"
  uid: 3730d464-f3d2-40d5-9044-39a42b151afe
spec:
  bpfEnabled: true
  bpfExternalServiceMode: DSR
  bpfLogLevel: Debug
  defaultEndpointToHostAction: Accept
  floatingIPs: Disabled
  ipipEnabled: false
  logSeverityScreen: Info
  reportingInterval: 0s
  vxlanEnabled: true
  wireguardEnabled: false

@sridhartigera
Member

@Aslan-Liu I have a cluster with Calico 3.25.2. I created an nginx NodePort service, then created a test pod (ubuntu) on one of the nodes and tried to access the service using nodeIP:nodePort, where nodeIP is the IP of the node on which the test pod is running. I can see the connectivity working fine. Am I missing something?

@Aslan-Liu
Author

Aslan-Liu commented Oct 27, 2023

@Aslan-Liu I have a cluster with Calico 3.25.2. I created an nginx NodePort service, then created a test pod (ubuntu) on one of the nodes and tried to access the service using nodeIP:nodePort, where nodeIP is the IP of the node on which the test pod is running. I can see the connectivity working fine. Am I missing something?

@sridhartigera

  1. You should run nginx natively on one node, not in a pod. For example, install nginx on a node with 'apt install' and expose the service via HostIP and HostPort. (The node on which nginx is installed will be called Node-A below.)
  2. Run a pod and schedule it on Node-A.
  3. In the pod, try to connect to the nginx service via Node-A's HostIP:HostPort (see the sketch below).
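
Roughly, the steps above could look like this (a sketch; the node name, image, and port are examples):

    # 1. On Node-A, run a service natively on the host (listens on HostIP:80):
    sudo apt install -y nginx

    # 2. Run a pod pinned to Node-A:
    kubectl run debug --image=curlimages/curl --restart=Never \
      --overrides='{"spec":{"nodeName":"node-a"}}' \
      --command -- sleep 3600

    # 3. From that pod, connect to Node-A's HostIP:HostPort:
    kubectl exec debug -- curl -m 5 http://<node-a-host-ip>:80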

@sridhartigera
Member

NAME                          STATUS   ROLES           AGE   VERSION   INTERNAL-IP    EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION    CONTAINER-RUNTIME
sridhar-bz-cvq5-kadm-ms       Ready    control-plane   34m   v1.27.7   10.128.1.87    <none>        Ubuntu 20.04.6 LTS   5.15.0-1045-gcp   containerd://1.6.24
sridhar-bz-cvq5-kadm-node-0   Ready    <none>          32m   v1.27.7   10.128.1.111   <none>        Ubuntu 20.04.6 LTS   5.15.0-1045-gcp   containerd://1.6.24
sridhar-bz-cvq5-kadm-node-1   Ready    <none>          32m   v1.27.7   10.128.1.109   <none>        Ubuntu 20.04.6 LTS   5.15.0-1045-gcp   containerd://1.6.24
sridhar-bz-cvq5-kadm-node-2   Ready    <none>          32m   v1.27.7   10.128.1.110   <none>        Ubuntu 20.04.6 LTS   5.15.0-1045-gcp   containerd://1.6.24

I have nginx installed in sridhar-bz-cvq5-kadm-node-0 listening on port 80. I created a pod in node-0 and did a curl to 10.128.1.111:80 and it works.

NAME    READY   STATUS    RESTARTS   AGE   IP              NODE                          NOMINATED NODE   READINESS GATES
debug   1/1     Running   0          26m   192.168.54.69   sridhar-bz-cvq5-kadm-node-0   <none>           <none>
curl 10.128.1.111:80
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>

@Aslan-Liu
Author

@sridhartigera Thanks. Yes, the test you did is correct. But is the dataplane you used eBPF? Also, my OS is Ubuntu 22.04; I am not sure whether that might be a problem.

@sridhartigera
Member

sridhartigera commented Oct 28, 2023

@Aslan-Liu Yes. Dataplane is eBPF. This is the felix config

kubectl get felixconfiguration -oyaml
apiVersion: v1
items:
- apiVersion: crd.projectcalico.org/v1
  kind: FelixConfiguration
  metadata:
    annotations:
      projectcalico.org/metadata: '{"uid":"9dd6c45f-4950-408a-914b-c922aafe0880","creationTimestamp":"2023-10-27T20:22:26Z"}'
    creationTimestamp: "2023-10-27T20:22:26Z"
    generation: 3
    name: default
    resourceVersion: "1467"
    uid: d1b9423b-f5e5-46ab-a1a6-a6846b52fbf3
  spec:
    bpfEnabled: true
    bpfExternalServiceMode: DSR
    bpfKubeProxyIptablesCleanupEnabled: true
    bpfLogLevel: ""
    floatingIPs: Disabled
    ipipEnabled: true
    logSeverityScreen: Info
    reportingInterval: 0s
    vxlanEnabled: true

I can give it a try with ubuntu 22.04.

@Aslan-Liu
Author

Aslan-Liu commented Oct 29, 2023

@Aslan-Liu Yes. Dataplane is eBPF. This is the felix config

kubectl get felixconfiguration -oyaml
apiVersion: v1
items:
- apiVersion: crd.projectcalico.org/v1
  kind: FelixConfiguration
  metadata:
    annotations:
      projectcalico.org/metadata: '{"uid":"9dd6c45f-4950-408a-914b-c922aafe0880","creationTimestamp":"2023-10-27T20:22:26Z"}'
    creationTimestamp: "2023-10-27T20:22:26Z"
    generation: 3
    name: default
    resourceVersion: "1467"
    uid: d1b9423b-f5e5-46ab-a1a6-a6846b52fbf3
  spec:
    bpfEnabled: true
    bpfExternalServiceMode: DSR
    bpfKubeProxyIptablesCleanupEnabled: true
    bpfLogLevel: ""
    floatingIPs: Disabled
    ipipEnabled: true
    logSeverityScreen: Info
    reportingInterval: 0s
    vxlanEnabled: true

I can give it a try with ubuntu 22.04.

@sridhartigera I compared my FelixConfiguration with yours and found two differences. Could you help me check whether these differences could cause this issue?

First, my ipipEnabled is false, but yours is true.
Second, you set bpfKubeProxyIptablesCleanupEnabled to true, but I didn't.

Which one may cause this issue?

Here is my FelixConfiguration

apiVersion: crd.projectcalico.org/v1
kind: FelixConfiguration
metadata:
  annotations:
    projectcalico.org/metadata: '{"uid":"2af5e1da-dfcf-46d2-94dd-7d2c1015214e","creationTimestamp":"2023-09-20T06:24:14Z"}'
  creationTimestamp: "2023-09-20T06:24:14Z"
  generation: 2
  name: default
  resourceVersion: "27168"
  uid: 3730d464-f3d2-40d5-9044-39a42b151afe
spec:
  bpfEnabled: true
  bpfExternalServiceMode: DSR
  bpfLogLevel: Debug
  defaultEndpointToHostAction: Accept
  floatingIPs: Disabled
  ipipEnabled: false
  logSeverityScreen: Info
  reportingInterval: 0s
  vxlanEnabled: true
  wireguardEnabled: false

@sridhartigera
Member

@Aslan-Liu Can you please try setting bpfKubeProxyIptablesCleanupEnabled: true?
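
Assuming the default FelixConfiguration resource, a quick way to try that (a sketch):

    kubectl patch felixconfiguration default --type merge \
      -p '{"spec":{"bpfKubeProxyIptablesCleanupEnabled":true}}'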

@Aslan-Liu
Author

@tomastigera Thanks. At least I know I have to wait for these bugs to be fixed. I will keep watching this issue until it is fixed. Thank you.

@uablrek
Contributor

uablrek commented Jan 3, 2024

In eBPF mode I can't ping the (any) node IP from within a POD. I can ping other PODs (both ipv4 and ipv6).

Is that caused by the same problem as in this issue?

Calico v3.27.0, installed with calico.yaml, with "FELIX_BPFENABLED=true"

Update

I had disabled kube-proxy when it didn't work; with kube-proxy left running, I can ping node IPs.

@tomastigera
Contributor

In eBPF mode I can't ping the (any) node IP from within a POD. I can ping other PODs (both ipv4 and ipv6).

Is that caused by the same problem as in this issue?

I can ping nodes from pods, except the local node when the pod is in a nat-outgoing pool, which is the same issue as here and is fixed by #8380

Update

I had disabled kube-proxy when it didn't work; with kube-proxy left running, I can ping node IPs.

I do not see how kube-proxy rules would be related unless you are pinging a service. This is likely a different issue.
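
As an aside, a quick way to check whether a pod's IP pool has natOutgoing enabled - the condition under which the local-node SNAT path addressed by #8380 is taken - could look like this (a sketch, assuming calicoctl or the projectcalico.org CRDs are available):

    # With calicoctl; the NAT column shows natOutgoing per pool.
    calicoctl get ippool -o wide

    # Or directly via the CRD objects.
    kubectl get ippools.crd.projectcalico.org -o yaml | grep natOutgoing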

tomastigera added a commit to tomastigera/project-calico-calico that referenced this issue Jan 4, 2024
…ndpoint

If there is no wildcard HEP, there is no policy that should be applied.
But without skipping, empty list of profiles would create a default deny
rule if none of the non-existent profiles matches. That is obviously
always hit and traffic toward the host is dropped if
defaultEndpointToHostAction is set to RETURN.

fixes projectcalico#7252
@uablrek
Contributor

uablrek commented Jan 4, 2024

I do not see how kube-proxy rules would be related unless you are pinging a service. This is likely a different issue.

It's because calico-kube-controllers can't access the API server through the "kubernetes" service in v3.27.0 unless kube-proxy is loaded, but if kube-proxy is loaded, then calico-node pods never become "ready" (I have a catch-22 situation). I install with calico.yaml in both v3.26 and v3.27 (FELIX_BPFENABLED=true). It works with v3.26.4, but not with v3.27.0. I will check whether the installation differs between 3.26 and 3.27.

And you can't ping a service, at least not with kube-proxy, since it doesn't forward icmp.

But, you are right, my problem is likely a different issue.

tomastigera added a commit to tomastigera/project-calico-calico that referenced this issue Jan 4, 2024
…ndpoint

If there is no wildcard HEP, there is no policy that should be applied.
But without skipping, empty list of profiles would create a default deny
rule if none of the non-existent profiles matches. That is obviously
always hit and traffic toward the host is dropped if
defaultEndpointToHostAction is set to RETURN.

fixes projectcalico#7252
tomastigera added a commit to tomastigera/project-calico-calico that referenced this issue Jan 11, 2024
…ndpoint

If there is no wildcard HEP, there is no policy that should be applied.
But without skipping, empty list of profiles would create a default deny
rule if none of the non-existent profiles matches. That is obviously
always hit and traffic toward the host is dropped if
defaultEndpointToHostAction is set to RETURN.

fixes projectcalico#7252
@tomastigera changed the title from "Cannot access local node services when using eBPF" to "KubeSpray - Cannot access local node services when using eBPF" Jan 16, 2024
tomastigera added a commit to tomastigera/project-calico-calico that referenced this issue Jan 29, 2024
When a pod is accessing a local host, it should not get SNATed as the
host when it is in a nat-outgoing ippool. (a) it is unnecessary as the
local node can be accessed and (b) there is no way to return the traffic
as it would return to the host itself.

refs projectcalico#7252
tomastigera added a commit to tomastigera/project-calico-calico that referenced this issue Jan 29, 2024
If there is no wildcard HEP, there is no policy that should be applied.
But without skipping, empty list of profiles would create a default deny
rule if none of the non-existent profiles matches. That is obviously
always hit and traffic toward the host is dropped if
defaultEndpointToHostAction is set to RETURN.

fixes projectcalico#7252
@BloodyIron

@uablrek did #8388 (v3.27?) fix the matter for you? I'm in a scenario where I cannot remove/disable kube-proxy (RKE1, Rancher-managed) and I'm pretty sure I'm in the same situation as this thread, or close enough.

@BloodyIron

Wait, there isn't a release yet that includes this code change >:| The last release was Dec 15, 2023, which is definitely more than two weeks ago. How long before this is baked into a release?

@tomastigera
Contributor

It will be part of the upcoming 3.27.1 - soon!

@BloodyIron

@tomastigera yay!

tomastigera added a commit to tomastigera/project-calico-calico that referenced this issue Jan 31, 2024
When a pod is accessing a local host, it should not get SNATed as the
host when it is in a nat-outgoing ippool. (a) it is unnecessary as the
local node can be accessed and (b) there is no way to return the traffic
as it would return to the host itself.

refs projectcalico#7252
@tomastigera added this to the Calico v3.27.1 milestone Feb 1, 2024
mazdakn pushed a commit to mazdakn/calico that referenced this issue Mar 6, 2024
…ndpoint

If there is no wildcard HEP, there is no policy that should be applied.
But without skipping, empty list of profiles would create a default deny
rule if none of the non-existent profiles matches. That is obviously
always hit and traffic toward the host is dropped if
defaultEndpointToHostAction is set to RETURN.

fixes projectcalico#7252
@tomastigera
Contributor

@BloodyIron the fix is released, it is actually 3.27.2

@BloodyIron

@tomastigera sorry for the delay in my response, life stuff. I genuinely appreciate you directly tagging me, as on my end my need for eBPF via Calico is particularly important. I'm trying to solve a SourceIP problem, and I'm hoping this does the trick.

Anyways, just letting you know I'm now trying to make the time for this topic, and will aspire to get back to you with my results.

Again, appreciate your help on this ❤️

@BloodyIron

@tomastigera for certain reasons I may need to try addressing my SourceIP need with calico-[kube-controllers|node] v3.22.5 rather than the much newer v3.27.2, mainly because I need to upgrade multiple components just to reach v3.27.2 capabilities (I think; I could be wrong). I'm trying to find a solution that's "good enough" for now to get SourceIP while kube-proxy is active (RKE1), since I do not see a way to disable kube-proxy fully.

I'm currently going to try to get SourceIP fixed with eBPF using Calico v3.22.5; that might be a mistake, but I'm going to find out.

I have pretty substantial technical debt on my end, which makes me very reluctant to tackle the upgrades that are blocking me at this time, but I know I will need to overcome them in the near future. Just wanted to share.

@BloodyIron

BloodyIron commented Apr 10, 2024

Okay, so the original thing that brought me to this particular GitHub issue thread: when I try to enable eBPF, I get a failure that looks related to the original issue in this thread (this is log output from the calico-node pods after enabling eBPF):

2024-04-07 06:13:05.485 [WARNING][93] felix/daemon.go 1149: Config change requires restart key="BPFEnabled" new="true" updateType="add"

2024-04-07 06:13:05.501 [WARNING][93] felix/daemon.go 702: Felix is shutting down reason="config changed"

libbpf: prog 'calico_connect_v4': failed to attach to cgroup: No such file or directory

2024-04-07 06:13:09.024 [PANIC][975] felix/int_dataplane.go 716: BPFConnTimeLBEnabled but failed to attach connect-time load balancer, bailing out. error=failed to attach program calico_connect_v4: failed to attach calico_connect_v4 to cgroup /run/calico/cgroup (legacy try operation not permitted): no such file or directory

panic: (*logrus.Entry) 0xc0009d69b0

And a few days ago I found that there may be a solution to this that was rolled out in Calico v3.23.1: #6056

So I'm not sure whether v3.27.2 is necessarily relevant to my scenario. As such, I've been trying to upgrade my Calico to v3.23.3, and I'm probably doing something wrong in that process (failing along the way), but that's what I'm working on, and that's the context of why. Maybe it'll help someone to know.

I'll try to report back if I succeed with the upgrade and whether it solves my SourceIP problem. Appreciate the help thus far, thanks! :)
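
For what it's worth, the "no such file or directory" part of that panic suggests the cgroup mount Felix expects at /run/calico/cgroup is missing on the node; a quick check could look like this (a sketch; the path is taken from the log above):

    # Is a cgroup2 filesystem mounted, and does the path Felix wants exist?
    mount -t cgroup2
    ls -ld /run/calico/cgroup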

@BloodyIron

@BloodyIron the fix is released, it is actually 3.27.2

So in one of my dev k8s clusters I've switched to RKE2... guess which version of Calico it's feeding me? ;P v3.27.2! So I'm likely to try Calico in eBPF mode again, but with RKE2 I can actually properly disable kube-proxy for it.
