Missing route in VPC peering #495

Closed
dmai-apixio opened this issue Jun 3, 2019 · 4 comments
@dmai-apixio

I have 2 VPCs that are connected using VPC peering:

vpc-1 10.0.0.0/16
vpc-2 10.1.0.0/16

I set up an EKS cluster with 3 workers running in vpc-2. Security groups are open to all traffic.
I'm running a simple deployment that deploys nginx and exposes port 80:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 12
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.15.4
        ports:
        - name: http
          containerPort: 80

And here is the pod information:

NAME                                READY   STATUS    RESTARTS   AGE     IP             NODE                                         NOMINATED NODE
nginx-deployment-6c479b78c5-7bztf   1/1     Running   0          2d18h   10.1.133.30    ip-10-1-134-238.us-west-2.compute.internal   <none>
nginx-deployment-6c479b78c5-88mfk   1/1     Running   0          2d18h   10.1.128.223   ip-10-1-129-172.us-west-2.compute.internal   <none>
nginx-deployment-6c479b78c5-9qp9z   1/1     Running   0          2d18h   10.1.131.223   ip-10-1-129-172.us-west-2.compute.internal   <none>
nginx-deployment-6c479b78c5-d6nhq   1/1     Running   0          2d18h   10.1.130.219   ip-10-1-129-172.us-west-2.compute.internal   <none>
nginx-deployment-6c479b78c5-g69tr   1/1     Running   0          2d18h   10.1.139.180   ip-10-1-137-77.us-west-2.compute.internal    <none>
nginx-deployment-6c479b78c5-ghnq2   1/1     Running   0          2d18h   10.1.135.210   ip-10-1-134-238.us-west-2.compute.internal   <none>
nginx-deployment-6c479b78c5-hd5cr   1/1     Running   0          2d18h   10.1.131.102   ip-10-1-129-172.us-west-2.compute.internal   <none>
nginx-deployment-6c479b78c5-jfxr8   1/1     Running   0          2d18h   10.1.132.160   ip-10-1-134-238.us-west-2.compute.internal   <none>
nginx-deployment-6c479b78c5-m65hn   1/1     Running   0          2d18h   10.1.139.120   ip-10-1-137-77.us-west-2.compute.internal    <none>
nginx-deployment-6c479b78c5-qv68l   1/1     Running   0          2d18h   10.1.136.100   ip-10-1-137-77.us-west-2.compute.internal    <none>
nginx-deployment-6c479b78c5-t2xv6   1/1     Running   0          2d18h   10.1.138.6     ip-10-1-137-77.us-west-2.compute.internal    <none>
nginx-deployment-6c479b78c5-zrfjh   1/1     Running   0          2d18h   10.1.133.82    ip-10-1-134-238.us-west-2.compute.internal   <none>

From any server in vpc-2 (the same VPC running EKS), I can connect to port 80 of all of these pods. But from any server in vpc-1, I can only reach 9 of the 12 nginx pods.

kubectl get po -o wide | grep nginx | awk -F ' ' '{print $6}' | xargs -I{} bash -c 'echo -n {}; curl -s -o /dev/null -w " %{http_code}\n" -m 1 {}'
10.1.133.30 200
10.1.128.223 200
10.1.131.223 000
10.1.130.219 200
10.1.139.180 200
10.1.135.210 000
10.1.131.102 200
10.1.132.160 200
10.1.139.120 200
10.1.136.100 200
10.1.138.6 000
10.1.133.82 200

As the results above show, I cannot reach 10.1.131.223, 10.1.135.210, or 10.1.138.6. I ran /opt/cni/bin/aws-cni-support.sh on each worker and found that these three pods have one thing in common: their IPs belong to a secondary interface, not the primary one (eth0).

# pod.output on ip-10-1-134-238.us-west-2.compute.internal
  "nginx-deployment-6c479b78c5-ghnq2_testing_96f65f1f5e79a84afeacdf2f45c329cb38c847138fc3b2db4566ec48e18c0c42": {
    "IP": "10.1.135.210",
    "DeviceNumber": 2
  },
# pod.output on ip-10-1-137-77.us-west-2.compute.internal
  "nginx-deployment-6c479b78c5-t2xv6_testing_eeb39b4d85f9f7fdece2613e27d661127c394030c0b760959768cb4b749e7a0e": {
    "IP": "10.1.138.6",
    "DeviceNumber": 2
  }
# pod.output on ip-10-1-129-172.us-west-2.compute.internal
  "nginx-deployment-6c479b78c5-9qp9z_testing_eb35b8a44be06854517d834cdd459cea98a66f17c28317e185fa82bccb09ccd2": {
    "IP": "10.1.131.223",
    "DeviceNumber": 2
  },
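
For reference, the same check can be scripted against pod.output with jq. This is only a sketch, assuming pod.output is a single JSON object keyed by pod as in the excerpts above, and that a non-zero DeviceNumber indicates a secondary ENI:

# List pods whose IP is assigned to a secondary ENI (DeviceNumber != 0),
# based on the pod.output structure shown above.
jq 'to_entries[] | select(.value.DeviceNumber != 0) | {pod: .key, ip: .value.IP}' pod.output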

I picked the pod with IP 10.1.135.210 running on worker ip-10-1-134-238.us-west-2.compute.internal and continued by checking the routing tables:

[ec2-user@ip-10-1-134-238 ~]$ ip rule show
0:	from all lookup local
512:	from all to 10.1.132.11 lookup main
512:	from all to 10.1.134.230 lookup main
512:	from all to 10.1.133.129 lookup main
512:	from all to 10.1.135.193 lookup main
512:	from all to 10.1.135.47 lookup main
512:	from all to 10.1.133.95 lookup main
512:	from all to 10.1.133.82 lookup main
512:	from all to 10.1.135.210 lookup main
512:	from all to 10.1.132.160 lookup main
512:	from all to 10.1.133.30 lookup main
1024:	from all fwmark 0x80/0x80 lookup main
1536:	from 10.1.135.210 to 10.1.0.0/16 lookup 2
32766:	from all lookup main
32767:	from all lookup default
[ec2-user@ip-10-1-134-238 ~]$ ip route show table main
default via 10.1.132.1 dev eth0
10.1.132.0/22 dev eth0 proto kernel scope link src 10.1.134.238
10.1.132.11 dev eni3367e1e163b scope link
10.1.132.160 dev eni79af95bd2b2 scope link
10.1.133.30 dev eni7976e813d36 scope link
10.1.133.82 dev enia1d23be782e scope link
10.1.133.95 dev enid5a84d7679a scope link
10.1.133.129 dev eni4ab52ffff90 scope link
10.1.134.230 dev eni78b12ef97b2 scope link
10.1.135.47 dev eni10d2e7ffdce scope link
10.1.135.193 dev eni242a6cbb667 scope link
10.1.135.210 dev eni1dc02e51912 scope link
169.254.169.254 dev eth0
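
The per-ENI table that the priority-1536 rule points at can be inspected directly (table 2 here, taken from the "lookup 2" in the rule above):

# Routes applied to traffic from pods on the secondary ENI; per the rule at
# priority 1536, only destinations inside 10.1.0.0/16 are sent through it.
ip route show table 2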

I found there is no rule routing this pod's traffic back to vpc-1 (10.0.0.0/16) on the worker node. Adding a new rule made it work perfectly:

[root@ip-10-1-134-238 ec2-user]# ip rule add from 10.1.135.210 to 10.0.0.0/16 priority 1537 table 2

My question is: should the vpc-cni-k8s plugin add this routing automatically when it sees VPC peering? If I have multiple VPCs that need to talk to EKS, I have to add routing manually on every worker node, which is not great.
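
As a stop-gap, the manual fix can at least be scripted per worker. This is only a sketch under the assumptions above: the affected pod IPs are the ones pod.output shows on secondary ENIs, the table number is read from the pod's existing ip rule, and 10.0.0.0/16 is the peered CIDR.

# Hypothetical stop-gap, run as root on each worker: for every pod IP on a
# secondary ENI, reuse the table from its existing "from <ip> to <vpc-cidr>"
# rule and add a matching rule towards the peered VPC.
PEER_CIDR=10.0.0.0/16
for POD_IP in 10.1.135.210; do   # replace with the affected pod IPs on this worker
  TABLE=$(ip rule show | grep "from $POD_IP to" | awk '{print $NF}' | head -n1)
  ip rule add from "$POD_IP" to "$PEER_CIDR" priority 1537 table "$TABLE"
done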

@dmai-apixio dmai-apixio changed the title Missing route in multi VPCs environment Missing route in VPC peering Jun 3, 2019
@sethp-nr

sethp-nr commented Jun 3, 2019

My limited understanding is that VPC peers should be handled by the default route from the machine's perspective: do the two VPCs that you've paired have entries for the peering connection in their route tables?
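
For what it's worth, one way to double-check that from the CLI (the VPC ID below is a placeholder) is to dump each VPC's routes and look for an entry to the peer's CIDR whose target is the pcx-... peering connection:

# Placeholder VPC ID; repeat for both VPCs. There should be a route to the
# other VPC's CIDR (10.0.0.0/16 or 10.1.0.0/16) via a VpcPeeringConnectionId.
aws ec2 describe-route-tables \
  --filters Name=vpc-id,Values=vpc-0123456789abcdef0 \
  --query 'RouteTables[].Routes[]' --output table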

@mogren mogren added the question label Jun 4, 2019
@dmai-apixio

@sethp-nr I think the route tables for the peering connection are correct. I can reach 9 of the 12 pods in vpc-2 from vpc-1.

@ewbankkit

@dmai-apixio What is your AWS_VPC_K8S_CNI_EXTERNALSNAT value?

@mogren

mogren commented Sep 28, 2019

Either enable AWS_VPC_K8S_CNI_EXTERNALSNAT, or use AWS_VPC_K8S_CNI_EXCLUDE_SNAT_CIDRS added in #520 and available in v1.6.0-rc2 or later.
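
For anyone who finds this later: both options are environment variables on the aws-node DaemonSet. A minimal sketch (the CIDR here is this issue's peer VPC; adjust to your own):

# Option 1: disable SNAT for pod traffic leaving the VPC entirely
kubectl set env daemonset aws-node -n kube-system AWS_VPC_K8S_CNI_EXTERNALSNAT=true

# Option 2 (v1.6.0-rc2 or later): keep SNAT but exclude the peered VPC's CIDR
kubectl set env daemonset aws-node -n kube-system AWS_VPC_K8S_CNI_EXCLUDE_SNAT_CIDRS=10.0.0.0/16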

@mogren mogren closed this as completed Sep 28, 2019