Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initialization failure: ipamd init: failed to retrieve attached ENIs info (NoCredentialProviders?) #1051

Closed
rayburgemeestre opened this issue Jun 24, 2020 · 3 comments
Labels

Comments

@rayburgemeestre
Copy link

Hi all, I apologize if this is not where this belongs, but I'm in desperate need for a pointer in the right direction 😄

I am deploying Kubernetes manually on EC2 instances. I used to have no issues with my deployments, I just had to make sure I disabled the source/dest check on all the nodes. However I am running into routing issues with that approach so I decided to give this amazon-vpc-cni-k8s project a shot.

I did the changes to kubelet, etc., build the Dockerfile myself and host it somewhere. I'm using version amazon-k8s-cni:v1.6.0-rc1-211-g1af40d2-dirty (dirty because I inserted the strace command in the entrypoint.sh).

However my Pods are not

[root@rb-oscar ~]# kubectl get pod -n kube-system -l k8s-app=aws-node -o wide
NAME             READY   STATUS       RESTARTS   AGE     IP             NODE       NOMINATED NODE   READINESS GATES
aws-node-7bbjw   0/1     Init:Error   2          2m12s   10.0.148.136   cnode001   <none>           <none>
aws-node-99s56   0/1     Init:Error   2          2m13s   10.0.15.174    rb-oscar   <none>           <none>
aws-node-rfmdt   0/1     Init:Error   2          2m13s   10.0.159.37    cnode002   <none>           <none>

There is nothing in the logs, and describe yields:

Events:
  Type     Reason     Age                 From               Message
  ----     ------     ----                ----               -------
  Normal   Scheduled  <unknown>           default-scheduler  Successfully assigned kube-system/aws-node-7bbjw to cnode001
  Normal   Created    12m (x4 over 15m)   kubelet, cnode001  Created container aws-vpc-cni-init
  Normal   Started    12m (x4 over 15m)   kubelet, cnode001  Started container aws-vpc-cni-init
  Normal   Pulling    11m (x5 over 15m)   kubelet, cnode001  Pulling image "rayburgemeestre/amazon-k8s-cni:v1.6.0-rc1-211-g1af40d2-dirty"
  Normal   Pulled     11m (x5 over 15m)   kubelet, cnode001  Successfully pulled image "rayburgemeestre/amazon-k8s-cni:v1.6.0-rc1-211-g1af40d2-dirty"
  Warning  BackOff    16s (x49 over 14m)  kubelet, cnode001  Back-off restarting failed container

I am only able to get some actual output from the docker logs. I'm puzzled where I should find the logs they are not written anywhere on the host, but I managed to get the output by inserting strace:

[root@rb-oscar ~]# docker logs -f 64c604cb63de
Copying CNI plugin binaries ... 
Starting IPAM daemon in the background ... ok.
Checking for IPAM connectivity ...  failed.
Timed out waiting for IPAM daemon to start:
My log:
14    1592998428.123867 write(5, "{\"level\":\"info\",\"ts\":\"2020-06-24T11:33:48.122Z\",\"caller\":\"aws-k8s-agent/main.go:30\",\"msg\":\"Starting L-IPAMD v1.6.0-rc1-211-g1af40d21-dirty  ...\"}\n", 146) = 146
27    1592998428.230343 write(6, "{\"level\":\"info\",\"ts\":\"2020-06-24T11:33:48.229Z\",\"caller\":\"aws-k8s-agent/main.go:42\",\"msg\":\"Testing communication with server\"}\n", 127) = 127
26    1592998428.232205 write(6, "{\"level\":\"info\",\"ts\":\"2020-06-24T11:33:48.232Z\",\"caller\":\"aws-k8s-agent/main.go:42\",\"msg\":\"Successful communication with the Cluster! Cluster Version is: v1.16. git version: v1.16.1. git tree state: clean. commit: d647ddbd755faf07169599a625faf302ffc34458. platform: linux/amd64\"}\n", 280) = 280
14    1592998428.246857 write(7, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.246Z\",\"caller\":\"ipamd/ipamd.go:300\",\"msg\":\"Discovered region: eu-west-1\"}\n", 117) = 117
26    1592998428.252821 write(7, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.252Z\",\"caller\":\"awsutils/awsutils.go:253\",\"msg\":\"Found availability zone: eu-west-1a \"}\n", 131) = 131
14    1592998428.256735 write(7, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.256Z\",\"caller\":\"awsutils/awsutils.go:253\",\"msg\":\"Discovered the instance primary ip address: 10.0.15.174\"}\n", 150) = 150
26    1592998428.260648 write(7, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.260Z\",\"caller\":\"awsutils/awsutils.go:253\",\"msg\":\"Found instance-id: i-0bfffd1709a374844 \"}\n", 134) = 134
26    1592998428.264977 write(7, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.264Z\",\"caller\":\"awsutils/awsutils.go:253\",\"msg\":\"Found instance-type: t2.medium \"}\n", 126) = 126
27    1592998428.270547 write(7, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.270Z\",\"caller\":\"awsutils/awsutils.go:253\",\"msg\":\"Found primary interface's MAC address: 0a:8e:d0:81:33:a6\"}\n", 151) = 151
24    1592998428.290060 write(7, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.289Z\",\"caller\":\"awsutils/awsutils.go:313\",\"msg\":\"Discovered 1 interfaces.\"}\n", 119) = 119
28    1592998428.295910 write(7, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.295Z\",\"caller\":\"awsutils/awsutils.go:313\",\"msg\":\"Found device-number: 0 \"}\n", 118) = 118
28    1592998428.300257 write(7, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.300Z\",\"caller\":\"awsutils/awsutils.go:313\",\"msg\":\"Found account ID: 197943594779\"}\n", 125) = 125
24    1592998428.304553 write(7, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.304Z\",\"caller\":\"awsutils/awsutils.go:313\",\"msg\":\"Found eni: eni-00b7096277a484055 \"}\n", 128) = 128
24    1592998428.304756 write(7, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.304Z\",\"caller\":\"awsutils/awsutils.go:313\",\"msg\":\"eni-00b7096277a484055 is the primary ENI of this instance\"}\n", 152) = 152
24    1592998428.309411 write(7, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.309Z\",\"caller\":\"awsutils/awsutils.go:253\",\"msg\":\"Found security-group id: sg-04df31df22a02bce4\"}\n", 140) = 140
24    1592998428.309678 write(7, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.309Z\",\"caller\":\"awsutils/awsutils.go:253\",\"msg\":\"Found security-group id: sg-0a3bc01a806ecc616\"}\n", 140) = 140
24    1592998428.314162 write(7, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.314Z\",\"caller\":\"awsutils/awsutils.go:253\",\"msg\":\"Found subnet-id: subnet-00a7224d5a15858a5 \"}\n", 137) = 137
24    1592998428.318649 write(7, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.318Z\",\"caller\":\"awsutils/awsutils.go:253\",\"msg\":\"Found vpc-ipv4-cidr-block: 10.0.0.0/16 \"}\n", 134) = 134
28    1592998428.323790 write(7, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.323Z\",\"caller\":\"awsutils/awsutils.go:253\",\"msg\":\"Found VPC CIDR: 10.0.0.0/16\"}\n", 122) = 122
27    1592998428.324693 write(7, "{\"level\":\"info\",\"ts\":\"2020-06-24T11:33:48.324Z\",\"caller\":\"wait/wait.go:133\",\"msg\":\"Will attempt to clean up AWS CNI leaked ENIs after waiting 1m52s.\"}\n", 151) = 151
28    1592998428.326688 write(8, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.325Z\",\"caller\":\"ipamd/ipamd.go:317\",\"msg\":\"Start node init\"}\n", 104) = 104
28    1592998428.327816 write(9, "{\"level\":\"info\",\"ts\":\"2020-06-24T11:33:48.327Z\",\"caller\":\"ipamd/ipamd.go:355\",\"msg\":\"Setting up host network... \"}\n", 115) = 115
28    1592998428.331930 write(9, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.331Z\",\"caller\":\"networkutils/network.go:236\",\"msg\":\"Trying to find primary interface that has mac : 0a:8e:d0:81:33:a6\"}\n", 163) = 163
28    1592998428.334100 write(9, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.333Z\",\"caller\":\"networkutils/network.go:236\",\"msg\":\"Discovered interface: lo, mac: \"}\n", 129) = 129
28    1592998428.334344 write(9, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.334Z\",\"caller\":\"networkutils/network.go:236\",\"msg\":\"Discovered interface: eth0, mac: 0a:8e:d0:81:33:a6\"}\n", 148) = 148
28    1592998428.334561 write(9, "{\"level\":\"info\",\"ts\":\"2020-06-24T11:33:48.334Z\",\"caller\":\"networkutils/network.go:236\",\"msg\":\"Discovered primary interface: eth0\"}\n", 131) = 131
28    1592998428.334833 write(9, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.334Z\",\"caller\":\"ipamd/ipamd.go:355\",\"msg\":\"Setting RPF for primary interface: net/ipv4/conf/eth0/rp_filter\"}\n", 152) = 152
28    1592998428.338816 write(9, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.338Z\",\"caller\":\"networkutils/network.go:263\",\"msg\":\"Found the Link that uses mac address 0a:8e:d0:81:33:a6 and its index is 2 (attempt 1/5)\"}\n", 185) = 185
28    1592998428.374489 write(9, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.374Z\",\"caller\":\"networkutils/network.go:314\",\"msg\":\"Setup Host Network: loading existing iptables nat SNAT exclusion rules\"}\n", 168) = 168
27    1592998428.431939 write(9, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.431Z\",\"caller\":\"ipamd/ipamd.go:355\",\"msg\":\"Setup Host Network: iptables -N AWS-SNAT-CHAIN-0 -t nat\"}\n", 144) = 144
28    1592998428.444497 write(9, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.444Z\",\"caller\":\"ipamd/ipamd.go:355\",\"msg\":\"Setup Host Network: iptables -N AWS-SNAT-CHAIN-1 -t nat\"}\n", 144) = 144
24    1592998428.457642 write(9, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.456Z\",\"caller\":\"ipamd/ipamd.go:355\",\"msg\":\"Setup Host Network: iptables -A POSTROUTING -m comment --comment \\\"AWS SNAT CHAIN\\\" -j AWS-SNAT-CHAIN-0\"}\n", 192) = 192
24    1592998428.458575 write(9, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.458Z\",\"caller\":\"ipamd/ipamd.go:355\",\"msg\":\"Setup Host Network: iptables -A AWS-SNAT-CHAIN-0 ! -d {10.0.0.0/16 %!s(bool=false)} -t nat -j AWS-SNAT-CHAIN-1\"}\n", 199) = 199
24    1592998428.458819 write(9, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.458Z\",\"caller\":\"ipamd/ipamd.go:355\",\"msg\":\"Setup Host Network: synchronising SNAT stale rules\"}\n", 139) = 139
24    1592998428.459088 write(9, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.458Z\",\"caller\":\"ipamd/ipamd.go:355\",\"msg\":\"iptableRules: [nat/POSTROUTING rule first SNAT rules for non-VPC outbound traffic nat/AWS-SNAT-CHAIN-0 rule [0] AWS-SNAT-CHAIN nat/AWS-SNAT-CHAIN-1 rule last SNAT rule for non-VPC outbound traffic]\"}\n", 286) = 286
24    1592998428.459325 write(9, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.459Z\",\"caller\":\"ipamd/ipamd.go:355\",\"msg\":\"execute iptable rule : first SNAT rules for non-VPC outbound traffic\"}\n", 157) = 157
24    1592998428.486569 write(9, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.486Z\",\"caller\":\"ipamd/ipamd.go:355\",\"msg\":\"execute iptable rule : [0] AWS-SNAT-CHAIN\"}\n", 130) = 130
24    1592998428.514525 write(9, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.514Z\",\"caller\":\"ipamd/ipamd.go:355\",\"msg\":\"execute iptable rule : last SNAT rule for non-VPC outbound traffic\"}\n", 155) = 155
24    1592998428.545450 write(9, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.545Z\",\"caller\":\"ipamd/ipamd.go:355\",\"msg\":\"execute iptable rule : connmark for primary ENI\"}\n", 136) = 136
28    1592998428.578966 write(9, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.578Z\",\"caller\":\"ipamd/ipamd.go:355\",\"msg\":\"execute iptable rule : connmark restore for primary ENI\"}\n", 144) = 144
28    1592998428.616110 write(9, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.616Z\",\"caller\":\"ipamd/ipamd.go:355\",\"msg\":\"execute iptable rule : rule for primary address 10.0.15.174\"}\n", 148) = 148
27    1592998428.632646 write(7, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.632Z\",\"caller\":\"awsutils/awsutils.go:948\",\"msg\":\"Total number of interfaces found: 1 \"}\n", 131) = 131
27    1592998428.632736 write(7, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.632Z\",\"caller\":\"awsutils/awsutils.go:443\",\"msg\":\"Found ENI MAC address: 0a:8e:d0:81:33:a6\"}\n", 135) = 135
28    1592998428.639997 write(7, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.639Z\",\"caller\":\"awsutils/awsutils.go:457\",\"msg\":\"Using device number 0 for primary eni: eni-00b7096277a484055\"}\n", 155) = 155
28    1592998428.640100 write(7, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.640Z\",\"caller\":\"awsutils/awsutils.go:443\",\"msg\":\"Found ENI: eni-00b7096277a484055, MAC 0a:8e:d0:81:33:a6, device 0\"}\n", 160) = 160
27    1592998428.645659 write(7, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.645Z\",\"caller\":\"awsutils/awsutils.go:463\",\"msg\":\"Found CIDR 10.0.0.0/17 for ENI 0a:8e:d0:81:33:a6\"}\n", 143) = 143
28    1592998428.648498 write(7, "{\"level\":\"debug\",\"ts\":\"2020-06-24T11:33:48.648Z\",\"caller\":\"awsutils/awsutils.go:463\",\"msg\":\"Found IP addresses [10.0.15.174] on ENI 0a:8e:d0:81:33:a6\"}\n", 152) = 152
28    1592998428.800764 write(7, "{\"level\":\"error\",\"ts\":\"2020-06-24T11:33:48.800Z\",\"caller\":\"ipamd/ipamd.go:360\",\"msg\":\"Failed to call ec2:DescribeNetworkInterfaces for [eni-00b7096277a484055]: NoCredentialProviders: no valid providers in chain. Deprecated.\\n\\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors\"}\n", 296) = 296
28    1592998428.974266 write(7, "{\"level\":\"error\",\"ts\":\"2020-06-24T11:33:48.974Z\",\"caller\":\"ipamd/ipamd.go:360\",\"msg\":\"Failed to call ec2:DescribeNetworkInterfaces for [eni-00b7096277a484055]: NoCredentialProviders: no valid providers in chain. Deprecated.\\n\\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors\"}\n", 296) = 296
24    1592998429.154541 write(7, "{\"level\":\"error\",\"ts\":\"2020-06-24T11:33:49.154Z\",\"caller\":\"ipamd/ipamd.go:360\",\"msg\":\"Failed to call ec2:DescribeNetworkInterfaces for [eni-00b7096277a484055]: NoCredentialProviders: no valid providers in chain. Deprecated.\\n\\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors\"}\n", 296) = 296
28    1592998429.319573 write(7, "{\"level\":\"error\",\"ts\":\"2020-06-24T11:33:49.319Z\",\"caller\":\"ipamd/ipamd.go:360\",\"msg\":\"Failed to call ec2:DescribeNetworkInterfaces for [eni-00b7096277a484055]: NoCredentialProviders: no valid providers in chain. Deprecated.\\n\\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors\"}\n", 296) = 296
28    1592998429.515871 write(7, "{\"level\":\"error\",\"ts\":\"2020-06-24T11:33:49.515Z\",\"caller\":\"ipamd/ipamd.go:360\",\"msg\":\"Failed to call ec2:DescribeNetworkInterfaces for [eni-00b7096277a484055]: NoCredentialProviders: no valid providers in chain. Deprecated.\\n\\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors\"}\n", 296) = 296
26    1592998429.720885 write(7, "{\"level\":\"error\",\"ts\":\"2020-06-24T11:33:49.720Z\",\"caller\":\"ipamd/ipamd.go:360\",\"msg\":\"Failed to call ec2:DescribeNetworkInterfaces for [eni-00b7096277a484055]: NoCredentialProviders: no valid providers in chain. Deprecated.\\n\\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors\"}\n", 296) = 296
24    1592998429.910736 write(7, "{\"level\":\"error\",\"ts\":\"2020-06-24T11:33:49.910Z\",\"caller\":\"ipamd/ipamd.go:360\",\"msg\":\"Failed to call ec2:DescribeNetworkInterfaces for [eni-00b7096277a484055]: NoCredentialProviders: no valid providers in chain. Deprecated.\\n\\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors\"}\n", 296) = 296
14    1592998430.097925 write(7, "{\"level\":\"error\",\"ts\":\"2020-06-24T11:33:50.097Z\",\"caller\":\"ipamd/ipamd.go:360\",\"msg\":\"Failed to call ec2:DescribeNetworkInterfaces for [eni-00b7096277a484055]: NoCredentialProviders: no valid providers in chain. Deprecated.\\n\\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors\"}\n", 296) = 296
26    1592998430.342582 write(7, "{\"level\":\"error\",\"ts\":\"2020-06-24T11:33:50.342Z\",\"caller\":\"ipamd/ipamd.go:360\",\"msg\":\"Failed to call ec2:DescribeNetworkInterfaces for [eni-00b7096277a484055]: NoCredentialProviders: no valid providers in chain. Deprecated.\\n\\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors\"}\n", 296) = 296
28    1592998430.555650 write(7, "{\"level\":\"error\",\"ts\":\"2020-06-24T11:33:50.555Z\",\"caller\":\"ipamd/ipamd.go:360\",\"msg\":\"Failed to call ec2:DescribeNetworkInterfaces for [eni-00b7096277a484055]: NoCredentialProviders: no valid providers in chain. Deprecated.\\n\\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors\"}\n", 296) = 296
26    1592998430.818605 write(7, "{\"level\":\"error\",\"ts\":\"2020-06-24T11:33:50.818Z\",\"caller\":\"ipamd/ipamd.go:360\",\"msg\":\"Failed to call ec2:DescribeNetworkInterfaces for [eni-00b7096277a484055]: NoCredentialProviders: no valid providers in chain. Deprecated.\\n\\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors\"}\n", 296) = 296
27    1592998431.079853 write(7, "{\"level\":\"error\",\"ts\":\"2020-06-24T11:33:51.079Z\",\"caller\":\"ipamd/ipamd.go:360\",\"msg\":\"Failed to call ec2:DescribeNetworkInterfaces for [eni-00b7096277a484055]: NoCredentialProviders: no valid providers in chain. Deprecated.\\n\\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors\"}\n", 296) = 296
27    1592998431.190801 write(5, "{\"level\":\"error\",\"ts\":\"2020-06-24T11:33:51.190Z\",\"caller\":\"aws-k8s-agent/main.go:30\",\"msg\":\"Initialization failure: ipamd init: failed to retrieve attached ENIs info\"}\n", 168) = 168

I feel like I'm pretty close, but it looks like it's using different credentials or something, could that be? How can I set the credentials to use, because I have been able to assign the appropriate IAM permissions AFAIC.

Thanks in advance,
Ray

@mogren
Copy link
Contributor

mogren commented Jun 24, 2020

Hi @rayburgemeestre, what is permissions you have given to the node where the CNI runs? By default in EKS, we give all nodes the managed AmazonEKS_CNI_Policy, and I assume that's what you mean by IAM permissions.

For the credentials, are you using https://github.com/jtblin/kube2iam or something else?

@rayburgemeestre
Copy link
Author

rayburgemeestre commented Jun 26, 2020

Alright @mogren thanks a bunch, I didn't realize you could assign IAM permissions to VM's directly. Your mention of that policy helped us find a solution. I don't have permissions to view or edit IAM stuff in my organization, but someone helped me and followed the instructions at: https://docs.aws.amazon.com/eks/latest/userguide/worker_node_IAM_role.html

This gets me passed the error, now I will have to troubleshoot the next error: backing store restore returned err open /var/run/aws-node/ipam.json: no such file or directory. Saving that for next week since I have to suspend working on this. I will close this question though.

@marcdk
Copy link

marcdk commented Oct 10, 2020

Hi @rayburgemeestre!

Just a quick ask if you had managed to solve the issue with the backing store restore error message?
The error message I'm getting seems related:

Initialization failure: open /var/run/aws-node/ipam.json.tmp008033499: no such file or directory

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants