rke cert rotate does not work with --ssh-agent-auth: Can not update cluster #1479

Closed
ajfriesen opened this issue Jul 18, 2019 · 4 comments

@ajfriesen

RKE version:

v0.2.5

Docker version: (docker version,docker info preferred)

docker version
Client:
 Version:           18.09.7
 API version:       1.39
 Go version:        go1.10.8
 Git commit:        2d0083d
 Built:             Thu Jun 27 17:56:17 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          18.09.2
  API version:      1.39 (minimum version 1.12)
  Go version:       go1.10.6
  Git commit:       6247962
  Built:            Sun Feb 10 03:42:13 2019
  OS/Arch:          linux/amd64
  Experimental:     false

docker info
Containers: 40
 Running: 37
 Paused: 0
 Stopped: 3
Images: 20
Server Version: 18.09.2
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 198
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 894b81a4b802e4eb2a91d1ce216b8817763c29fb
runc version: 425e105d5a03fabd737a126ad93d62a9eeede87f
init version: fec3683
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.4.0-1087-aws
Operating System: Ubuntu 16.04.6 LTS
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 31.42GiB
Name: ip-172-19-5-194
ID: 6AWQ:S5OP:LU3M:RSHU:D65B:P3QM:5UBJ:PRA2:7BAS:ZIXT:TDDN:BX6I
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
Product License: Community Engine

WARNING: No swap limit support

Operating system and kernel: (cat /etc/os-release, uname -r preferred)

uname -r
4.4.0-1087-aws
cat /etc/os-release 
NAME="Ubuntu"
VERSION="16.04.6 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.6 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial

Type/provider of hosts: (VirtualBox/Bare-metal/AWS/GCE/DO)

AWS

cluster.yml file:

cat rancher-cluster.yml 
nodes:
  - address: IP1
    user: ubuntu
    role: [controlplane,worker,etcd]
  - address: IP2
    user: ubuntu
    role: [controlplane,worker,etcd]
  - address: IP3
    user: ubuntu
    role: [controlplane,worker,etcd]

services:
  etcd:
    snapshot: true
    creation: 6h
    retention: 24h

Steps to Reproduce:

Certificates cannot be rotated from a bastion host with --ssh-agent-auth, because rke cert rotate does not accept the flag:

rke_linux-amd64-v0.2.5 cert rotate --config rancher-cluster.yml --ssh-agent-auth
Incorrect Usage.

NAME:
   rke cert rotate - Rotate RKE cluster certificates

USAGE:
   rke cert rotate [command options] [arguments...]

OPTIONS:
   --config value   Specify an alternate cluster YAML file (default: "cluster.yml") [$RKE_CONFIG]
   --service value  Specify a k8s service to rotate certs, (allowed values: kube-apiserver, kube-controller-manager, kube-scheduler, kubelet, kube-proxy, etcd)
   --rotate-ca      Rotate all certificates including CA certs
   
FATA[0000] flag provided but not defined: -ssh-agent-auth

Running the command without the flag also fails, because the SSH private key is not present on that machine:

Results:

rke_linux-amd64-v0.2.5 cert rotate --config rancher-cluster.yml
INFO[0000] Initiating Kubernetes cluster                
INFO[0000] Rotating Kubernetes cluster certificates     
INFO[0000] [certificates] Generating Kube Controller certificates 
INFO[0000] [certificates] Generating Kube Proxy certificates 
INFO[0000] [certificates] Generating admin certificates and kubeconfig 
INFO[0000] [certificates] Generating etcd-IP1 certificate and key 
INFO[0000] [certificates] Generating etcd-IP2 certificate and key 
INFO[0001] [certificates] Generating etcd-IP3 certificate and key 
INFO[0001] [certificates] Generating Kubernetes API server certificates 
INFO[0001] [certificates] Generating Kubernetes API server proxy client certificates 
INFO[0001] [certificates] Generating Kube Scheduler certificates 
INFO[0002] [certificates] Generating Node certificate   
INFO[0002] Successfully Deployed state file at [./rancher-cluster.rkestate] 
INFO[0002] Rebuilding Kubernetes cluster with rotated certificates 
INFO[0002] [dialer] Setup tunnel for host [IP2] 
WARN[0002] Failed to set up SSH tunneling for host [IP2]: Can't establish dialer connection: Error while reading SSH key file: open /home/ubuntu/.ssh/id_rsa: no such file or directory 
INFO[0002] [dialer] Setup tunnel for host [IP3] 
WARN[0002] Failed to set up SSH tunneling for host [IP3]: Can't establish dialer connection: Error while reading SSH key file: open /home/ubuntu/.ssh/id_rsa: no such file or directory 
INFO[0002] [dialer] Setup tunnel for host [IP1] 
WARN[0002] Failed to set up SSH tunneling for host [IP1]: Can't establish dialer connection: Error while reading SSH key file: open /home/ubuntu/.ssh/id_rsa: no such file or directory 
WARN[0002] Removing host [IP2] from node lists  
WARN[0002] Removing host [IP3] from node lists 
WARN[0002] Removing host [IP1] from node lists 
FATA[0002] Cluster must have at least one etcd plane host: failed to connect to the following etcd host(s) [IP2]

I do not want to place the key on that machine.
Is there a way to implement this or at least work around this?

@galal-hussein
Contributor

@ajfriesen Thanks for opening the issue. Right now cert rotate is missing the ssh-agent-auth flag, so SSH agent authentication cannot be used for rotation without a code change. Until this issue is fixed, the manual workaround is to add the key on the node, rotate, and then remove the key.
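The manual workaround described above (add the key, rotate, remove the key) could be wrapped so that the key is removed even if the rotation fails. This is only a sketch: the function name with_temp_key and the source path /path/to/private_key are hypothetical, and the destination ~/.ssh/id_rsa is assumed from the path that appears in the error log of the original report.

```shell
# Hypothetical helper, not part of rke: stage a key, run a command, remove the key.
with_temp_key() {
  local src="$1" dst="$2"
  shift 2
  # Refuse to clobber a key that is already present at the destination.
  if [ -e "$dst" ]; then
    echo "refusing to overwrite existing $dst" >&2
    return 1
  fi
  mkdir -p "$(dirname "$dst")"
  # Copy with owner-only permissions, as SSH private keys require.
  install -m 600 "$src" "$dst"
  local status=0
  # Run the wrapped command and remember its exit status.
  "$@" || status=$?
  # Remove the staged key whether or not the command succeeded.
  rm -f "$dst"
  return "$status"
}

# Example (assumed paths, not run here):
# with_temp_key /path/to/private_key "$HOME/.ssh/id_rsa" \
#   rke cert rotate --config rancher-cluster.yml
```

The helper returns the exit status of the wrapped command, so a failed rotation is still visible to the caller after the key has been cleaned up.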

@ajfriesen
Author

Thanks, did just that as a workaround.

It would still be great if this were implemented.

@superseb
Contributor

To test:

  • Create 2 Linux machines: one to run rke up from (machine1) and one to create the cluster with (machine2).
  • On machine1, the machine running rke up:
      Run 'ssh-keygen', accept the default filename, and use rancher as the passphrase.
      Run 'eval "$(ssh-agent -s)"' to start ssh-agent.
      Run 'ssh-add ~/.ssh/id_rsa' to add the key to ssh-agent, entering the passphrase rancher.
      Create cluster.yml with the data needed to create a cluster on machine2 (either via rke config or manually); the SSH key file is ~/.ssh/id_rsa.
      Run 'cat ~/.ssh/id_rsa.pub' and copy the public key so it can be placed on machine2.
  • On machine2, the machine being added to the cluster:
      Add the public key copied above to ~/.ssh/authorized_keys.

Before continuing, test that you can SSH from machine1 to machine2 using ssh -i ~/.ssh/id_rsa user@machine2, entering rancher as the passphrase, and make sure you can execute docker ps. Once that is validated, run rke up --ssh-agent-auth; it should provision the cluster. Then try to rotate the certificates using rke cert rotate --ssh-agent-auth; this should succeed.
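Before the rke up --ssh-agent-auth step, it can help to confirm that the shell really has an agent with a key loaded, since a missing SSH_AUTH_SOCK is easy to overlook in a fresh session. A minimal sketch, assuming OpenSSH's ssh-add is installed; the function name agent_ready is hypothetical:

```shell
# Hypothetical helper: succeeds only if an agent socket exists and holds a key.
agent_ready() {
  # SSH_AUTH_SOCK is exported by: eval "$(ssh-agent -s)"
  [ -n "${SSH_AUTH_SOCK:-}" ] && [ -S "$SSH_AUTH_SOCK" ] || return 1
  # ssh-add -l exits 0 when it lists keys, 1 when the agent holds none,
  # and 2 when the agent cannot be contacted.
  ssh-add -l >/dev/null 2>&1
}

# Example (not run here):
# agent_ready && rke up --ssh-agent-auth
```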

@sowmyav27

Reproduced the issue with v0.2.5.
Steps followed: #1479 (comment)
Result of rke cert rotate --ssh-agent-auth:

Incorrect Usage.

NAME:
   rke cert rotate - Rotate RKE cluster certificates

USAGE:
   rke cert rotate [command options] [arguments...]

OPTIONS:
   --config value   Specify an alternate cluster YAML file (default: "cluster.yml") [$RKE_CONFIG]
   --service value  Specify a k8s service to rotate certs, (allowed values: kube-apiserver, kube-controller-manager, kube-scheduler, kubelet, kube-proxy, etcd)
   --rotate-ca      Rotate all certificates including CA certs
   
FATA[0000] flag provided but not defined: -ssh-agent-auth 

Verified the fix with rke version v0.3.0-rc7.
Result of ./rke cert rotate --ssh-agent-auth:

......
INFO[0015] [worker] Restarting Worker Plane..           
INFO[0015] Restarting container [kubelet] on host [x.x.x.x], try #1 
INFO[0016] [restart/kubelet] Successfully restarted container on host [x.x.x.x] 
INFO[0016] Restarting container [kube-proxy] on host [x.x.x.x], try #1 
INFO[0016] [restart/kube-proxy] Successfully restarted container on host [x.x.x.x] 
INFO[0016] [worker] Successfully restarted Worker Plane.. 
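
The two results above differ only in whether the binary lists --ssh-agent-auth among the cert rotate options. A script that has to work with either version could inspect the help text first. This is a sketch under assumptions: supports_flag is a hypothetical helper, and it assumes the help output keeps the "OPTIONS" layout shown in the v0.2.5 output above, with each flag at the start of an indented line.

```shell
# Hypothetical helper: reads help text on stdin and succeeds if the given
# long flag starts one of the option lines.
supports_flag() {
  local flag="$1"
  # Match the flag at the start of a line (after indentation), followed by
  # whitespace or end of line, so --rotate-ca does not match --rotate-ca-foo.
  grep -Eq -e "^[[:space:]]*${flag}([[:space:]]|\$)"
}

# Example (not run here):
# if rke cert rotate --help 2>&1 | supports_flag --ssh-agent-auth; then
#   rke cert rotate --ssh-agent-auth
# else
#   echo "this rke build has no --ssh-agent-auth for cert rotate" >&2
# fi
```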
