Add commands to run commands and check cluster connectivity #1

Merged
blanquicet merged 3 commits into main from jose/initial-run-and-connectivity
Feb 14, 2022

@blanquicet blanquicet commented Jan 19, 2022

This PR adds the run-command and check-apiserver-connectivity commands, along with initial documentation.

How to use

As described in the README, it can be tested by doing:

$ git clone https://github.com/Azure/kubectl-az.git
$ cd kubectl-az
# Build and copy the resulting binary into $HOME/.local/bin/
$ make install

Then follow the per-command documentation in docs, or the examples in the Testing done section below, to try each command.

Testing done

Tests were executed on Linux with kernel 5.4 and Windows 11:

Testing run-command
# Using node name
$ kubectl az run-command "ip route" --node aks-agentpool-27170680-vmss000000
Running...

[stdout]
default via 10.240.0.1 dev eth0 proto dhcp src 10.240.0.4 metric 100
10.240.0.0/16 dev eth0 proto kernel scope link src 10.240.0.4
10.244.2.6 dev cali0b155bb80e7 scope link
10.244.2.7 dev cali997a02e57a6 scope link
10.244.2.8 dev calia2f1486fcb5 scope link
10.244.2.9 dev cali221544885dd scope link
10.244.2.10 dev cali8913de1b395 scope link
10.244.2.14 dev cali8eecb1f59c6 scope link
10.244.2.30 dev calic04a24d13d1 scope link
10.244.2.31 dev cali5825aa0ebff scope link
10.244.2.32 dev calidc10aa71b63 scope link
10.244.2.33 dev calid407e80bf07 scope link
168.63.129.16 via 10.240.0.1 dev eth0 proto dhcp src 10.240.0.4 metric 100
169.254.169.254 via 10.240.0.1 dev eth0 proto dhcp src 10.240.0.4 metric 100

[stderr]

# Using node name and verbose mode
$ kubectl az run-command "ip route" --node aks-agentpool-27170680-vmss000000 -v
Command: ip route
Virtual Machine Scale Set VM:
{
  "SubscriptionID": "fa899caf-8f31-4531-a2c5-4e23e4bbcd0d",
  "NodeResourceGroup": "mc_jose-rg_josecluster_westeurope",
  "VMScaleSet": "aks-agentpool-27170680-vmss",
  "InstanceID": "0"
}

Running...

Response:
{
  "value": [
    {
      "code": "ProvisioningState/succeeded",
      "displayStatus": "Provisioning succeeded",
      "level": "Info",
      "message": "Enable succeeded: \n[stdout]\ndefault via 10.240.0.1 dev eth0 proto dhcp src 10.240.0.4 metric 100 \n10.240.0.0/16 dev eth0 proto kernel scope link src 10.240.0.4 \n10.244.2.6 dev cali0b155bb80e7 scope link \n10.244.2.7 dev cali997a02e57a6 scope link \n10.244.2.8 dev calia2f1486fcb5 scope link \n10.244.2.9 dev cali221544885dd scope link \n10.244.2.10 dev cali8913de1b395 scope link \n10.244.2.14 dev cali8eecb1f59c6 scope link \n10.244.2.30 dev calic04a24d13d1 scope link \n10.244.2.31 dev cali5825aa0ebff scope link \n10.244.2.32 dev calidc10aa71b63 scope link \n10.244.2.33 dev calid407e80bf07 scope link \n168.63.129.16 via 10.240.0.1 dev eth0 proto dhcp src 10.240.0.4 metric 100 \n169.254.169.254 via 10.240.0.1 dev eth0 proto dhcp src 10.240.0.4 metric 100 \n\n[stderr]\n"
    }
  ]
}

[stdout]
default via 10.240.0.1 dev eth0 proto dhcp src 10.240.0.4 metric 100
10.240.0.0/16 dev eth0 proto kernel scope link src 10.240.0.4
10.244.2.6 dev cali0b155bb80e7 scope link
10.244.2.7 dev cali997a02e57a6 scope link
10.244.2.8 dev calia2f1486fcb5 scope link
10.244.2.9 dev cali221544885dd scope link
10.244.2.10 dev cali8913de1b395 scope link
10.244.2.14 dev cali8eecb1f59c6 scope link
10.244.2.30 dev calic04a24d13d1 scope link
10.244.2.31 dev cali5825aa0ebff scope link
10.244.2.32 dev calidc10aa71b63 scope link
10.244.2.33 dev calid407e80bf07 scope link
168.63.129.16 via 10.240.0.1 dev eth0 proto dhcp src 10.240.0.4 metric 100
169.254.169.254 via 10.240.0.1 dev eth0 proto dhcp src 10.240.0.4 metric 100

[stderr]

# Using VMSS instance information separately
$ kubectl az run-command "ip route" --subscription $SUBSCRIPTION --node-resource-group $NODERESOURCEGROUP --vmss $VMSS --instance-id $INSTANCEID
Running...

[stdout]
default via 10.240.0.1 dev eth0 proto dhcp src 10.240.0.4 metric 100
10.240.0.0/16 dev eth0 proto kernel scope link src 10.240.0.4
10.244.2.6 dev cali0b155bb80e7 scope link
10.244.2.7 dev cali997a02e57a6 scope link
10.244.2.8 dev calia2f1486fcb5 scope link
10.244.2.9 dev cali221544885dd scope link
10.244.2.10 dev cali8913de1b395 scope link
10.244.2.14 dev cali8eecb1f59c6 scope link
10.244.2.30 dev calic04a24d13d1 scope link
10.244.2.31 dev cali5825aa0ebff scope link
10.244.2.32 dev calidc10aa71b63 scope link
10.244.2.33 dev calid407e80bf07 scope link
168.63.129.16 via 10.240.0.1 dev eth0 proto dhcp src 10.240.0.4 metric 100
169.254.169.254 via 10.240.0.1 dev eth0 proto dhcp src 10.240.0.4 metric 100

[stderr]

# Using VMSS instance ID
$ kubectl az run-command "ip route" --id "/subscriptions/$SUBSCRIPTION/resourceGroups/$NODERESOURCEGROUP/providers/Microsoft.Compute/virtualMachineScaleSets/$VMSS/virtualmachines/$INSTANCEID"
Running...

[stdout]
default via 10.240.0.1 dev eth0 proto dhcp src 10.240.0.4 metric 100
10.240.0.0/16 dev eth0 proto kernel scope link src 10.240.0.4
10.244.2.6 dev cali0b155bb80e7 scope link
10.244.2.7 dev cali997a02e57a6 scope link
10.244.2.8 dev calia2f1486fcb5 scope link
10.244.2.9 dev cali221544885dd scope link
10.244.2.10 dev cali8913de1b395 scope link
10.244.2.14 dev cali8eecb1f59c6 scope link
10.244.2.30 dev calic04a24d13d1 scope link
10.244.2.31 dev cali5825aa0ebff scope link
10.244.2.32 dev calidc10aa71b63 scope link
10.244.2.33 dev calid407e80bf07 scope link
168.63.129.16 via 10.240.0.1 dev eth0 proto dhcp src 10.240.0.4 metric 100
169.254.169.254 via 10.240.0.1 dev eth0 proto dhcp src 10.240.0.4 metric 100

[stderr]
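
The --id form above packs the same four values that verbose mode prints (SubscriptionID, NodeResourceGroup, VMScaleSet, InstanceID) into a single resource path. As a minimal sketch of how such a path could be split back into those fields, assuming the segment layout shown in the example, the following uses a hypothetical parseVMSSInstanceID helper. It is not the plugin's actual code, and the case-insensitive matching of segment names is an assumption made because the example uses lowercase virtualmachines.

```go
package main

import (
	"fmt"
	"strings"
)

// VMSSInstance mirrors the fields printed by the verbose output above.
type VMSSInstance struct {
	SubscriptionID    string
	NodeResourceGroup string
	VMScaleSet        string
	InstanceID        string
}

// parseVMSSInstanceID is a hypothetical helper (not taken from the PR) that
// splits an ID of the form
// /subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.Compute/virtualMachineScaleSets/<vmss>/virtualmachines/<id>
func parseVMSSInstanceID(id string) (*VMSSInstance, error) {
	parts := strings.Split(strings.Trim(id, "/"), "/")
	if len(parts) != 10 {
		return nil, fmt.Errorf("unexpected number of segments in %q", id)
	}
	// Fixed segment names expected at the even positions, plus the provider.
	expected := map[int]string{
		0: "subscriptions",
		2: "resourceGroups",
		4: "providers",
		5: "Microsoft.Compute",
		6: "virtualMachineScaleSets",
		8: "virtualMachines",
	}
	for i, name := range expected {
		// Assumption: segment names are compared case-insensitively.
		if !strings.EqualFold(parts[i], name) {
			return nil, fmt.Errorf("segment %d is %q, expected %q", i, parts[i], name)
		}
	}
	inst := &VMSSInstance{
		SubscriptionID:    parts[1],
		NodeResourceGroup: parts[3],
		VMScaleSet:        parts[7],
		InstanceID:        parts[9],
	}
	if inst.SubscriptionID == "" || inst.NodeResourceGroup == "" ||
		inst.VMScaleSet == "" || inst.InstanceID == "" {
		return nil, fmt.Errorf("empty value in %q", id)
	}
	return inst, nil
}

func main() {
	// Values taken from the verbose output shown earlier.
	id := "/subscriptions/fa899caf-8f31-4531-a2c5-4e23e4bbcd0d" +
		"/resourceGroups/mc_jose-rg_josecluster_westeurope" +
		"/providers/Microsoft.Compute/virtualMachineScaleSets/aks-agentpool-27170680-vmss" +
		"/virtualmachines/0"
	inst, err := parseVMSSInstanceID(id)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", *inst)
}
```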

Testing check-apiserver-connectivity
# Using node name
$ kubectl az check-apiserver-connectivity --node aks-agentpool-27170680-vmss000000
Running...

Connectivity check: succeeded

# Using node name and verbose mode
$ kubectl az check-apiserver-connectivity --node aks-agentpool-27170680-vmss000000 -v
Command: kubectl --kubeconfig /var/lib/kubelet/kubeconfig version > /dev/null; echo $?
Virtual Machine Scale Set VM:
{
  "SubscriptionID": "fa899caf-8f31-4531-a2c5-4e23e4bbcd0d",
  "NodeResourceGroup": "mc_jose-rg_josecluster_westeurope",
  "VMScaleSet": "aks-agentpool-27170680-vmss",
  "InstanceID": "0"
}

Running...

Response:
{
  "value": [
    {
      "code": "ProvisioningState/succeeded",
      "displayStatus": "Provisioning succeeded",
      "level": "Info",
      "message": "Enable succeeded: \n[stdout]\n0\n\n[stderr]\n"
    }
  ]
}

Connectivity check: succeeded

# Using VMSS instance information separately
$ kubectl az check-apiserver-connectivity --subscription $SUBSCRIPTION --node-resource-group $NODERESOURCEGROUP --vmss $VMSS --instance-id $INSTANCEID
Running...

Connectivity check: succeeded

# Using VMSS instance ID
$ kubectl az check-apiserver-connectivity --id "/subscriptions/$SUBSCRIPTION/resourceGroups/$NODERESOURCEGROUP/providers/Microsoft.Compute/virtualMachineScaleSets/$VMSS/virtualmachines/$INSTANCEID"
Running...

Connectivity check: succeeded
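
The verbose output above shows that the check runs kubectl version on the node and echoes its exit status, so the [stdout] section carries a single number (0 in the successful run). A minimal sketch of turning that value into the printed result could look like the following. The connectivityResult name is hypothetical, and the failure wording is borrowed from the later commit message quoted further down this page rather than from this PR's code.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// connectivityResult is a hypothetical helper: it interprets the stdout of
// "kubectl ... version > /dev/null; echo $?" as an exit status.
func connectivityResult(stdout string) (string, error) {
	code, err := strconv.Atoi(strings.TrimSpace(stdout))
	if err != nil {
		return "", fmt.Errorf("parsing exit status from %q: %w", stdout, err)
	}
	if code == 0 {
		return "Connectivity check: succeeded", nil
	}
	return fmt.Sprintf("Connectivity check: failed with returned value %d", code), nil
}

func main() {
	// "0\n" is the stdout section seen in the verbose response above.
	msg, err := connectivityResult("0\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(msg)
}
```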

@blanquicet blanquicet left a comment

Starting some threads where it would be great to have people's thoughts.

cmd/checkConnectivity.go (outdated)
Comment on lines +79 to +105
	if len(res.Value) == 0 || res.Value[0] == nil {
		return "", errors.New("no response received after command execution")
	}
	val := res.Value[0]
Is it possible to have multiple values after using PollUntilDone()?

Comment on lines 85 to 111
	if to.String(val.Code) != "ProvisioningState/succeeded" {
		return "", errors.New("command execution didn't succeed")
	}
Isn't there a constant in the SDK to compare this?
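
To make the two checks discussed in these threads concrete, here is a minimal, SDK-free sketch that decodes the JSON response shown in the verbose output, applies the same "no value" and "not succeeded" checks, and splits the message into its [stdout] and [stderr] sections. The struct, the parseRunCommandOutput function, and the local succeededCode constant are all illustrative assumptions. The sketch does not settle whether the SDK exports such a constant or whether more than one value can be returned, and like the reviewed code it only reads the first value.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// runCommandResponse mirrors only the JSON fields used here, as they appear
// in the verbose output of this PR.
type runCommandResponse struct {
	Value []struct {
		Code    string `json:"code"`
		Message string `json:"message"`
	} `json:"value"`
}

// succeededCode is a local constant for illustration only.
const succeededCode = "ProvisioningState/succeeded"

// parseRunCommandOutput is a hypothetical helper that returns the stdout and
// stderr sections embedded in the first status message.
func parseRunCommandOutput(raw []byte) (string, string, error) {
	var res runCommandResponse
	if err := json.Unmarshal(raw, &res); err != nil {
		return "", "", fmt.Errorf("decoding response: %w", err)
	}
	if len(res.Value) == 0 {
		return "", "", errors.New("no response received after command execution")
	}
	val := res.Value[0]
	if val.Code != succeededCode {
		return "", "", errors.New("command execution didn't succeed")
	}

	// Message layout seen above: "Enable succeeded: \n[stdout]\n...\n[stderr]\n..."
	const outTag, errTag = "[stdout]\n", "\n[stderr]\n"
	start := strings.Index(val.Message, outTag)
	if start < 0 {
		return "", "", errors.New("no [stdout] section in message")
	}
	rest := val.Message[start+len(outTag):]
	end := strings.Index(rest, errTag)
	if end < 0 {
		return "", "", errors.New("no [stderr] section in message")
	}
	return rest[:end], rest[end+len(errTag):], nil
}

func main() {
	// Response body copied from the check-apiserver-connectivity verbose run.
	raw := []byte(`{"value":[{"code":"ProvisioningState/succeeded","displayStatus":"Provisioning succeeded","level":"Info","message":"Enable succeeded: \n[stdout]\n0\n\n[stderr]\n"}]}`)
	stdout, stderr, err := parseRunCommandOutput(raw)
	fmt.Printf("stdout=%q stderr=%q err=%v\n", stdout, stderr, err)
}
```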

Comment on lines +19 to +24
KUBECTL_AZ_TARGETS = \
	kubectl-az-linux-amd64 \
	kubectl-az-linux-arm64 \
	kubectl-az-darwin-amd64 \
	kubectl-az-darwin-arm64 \
	kubectl-az-windows-amd64
So far, I have tested it only on Linux and Windows

@mauriciovasquezbernal mauriciovasquezbernal left a comment

Some comments. I'm testing and it's working as intended.

@mauriciovasquezbernal

I think SIGINT needs some special handling. If I interrupt a running command, I need to wait for some timeout before I can run a command again.

$ ./kubectl-az run-command "ip route" --node aks-agentpool-38928455-vmss000000 -v
Command: ip route shfdsafddow
Virtual Machine Scale Set VM: {
  "SubscriptionID": "fa899caf-8f31-4531-a2c5-4e23e4bbcd0d",
  "NodeResourceGroup": "mc_mauricio-kubeaz-test_mauricio-kubeaz-test_eastus",
  "VMScaleSet": "aks-agentpool-38928455-vmss",
  "InstanceID": "0"
}

Running...
^C
$ ./kubectl-az run-command "ip route" --node aks-agentpool-38928455-vmss000000 -v
Command: ip route shfdsafddow
Virtual Machine Scale Set VM: {
  "SubscriptionID": "fa899caf-8f31-4531-a2c5-4e23e4bbcd0d",
  "NodeResourceGroup": "mc_mauricio-kubeaz-test_mauricio-kubeaz-test_eastus",
  "VMScaleSet": "aks-agentpool-38928455-vmss",
  "InstanceID": "0"
}

Error: failed to run command: couldn't run command: POST https://management.azure.com/subscriptions/fa899caf-8f31-4531-a2c5-4e23e4bbcd0d/resourceGroups/mc_mauricio-kubeaz-test_mauricio-kubeaz-test_eastus/providers/Microsoft.Compute/virtualMachineScaleSets/aks-agentpool-38928455-vmss/virtualmachines/0/runCommand
--------------------------------------------------------------------------------
RESPONSE 409: 409 Conflict
ERROR CODE: Conflict
--------------------------------------------------------------------------------
{
  "error": {
    "code": "Conflict",
    "message": "Run command extension execution is in progress. Please wait for completion before invoking a run command."
  }
}
--------------------------------------------------------------------------------

@blanquicet

> I think SIGINT needs some special handling. If I interrupt a running command, I need to wait for some timeout before I can run a command again.

According to the documentation, we can't cancel a running script. However, BeginRunCommand receives a context, so maybe there is something we can do with it. I will create an issue to work on this.

@blanquicet blanquicet force-pushed the jose/initial-run-and-connectivity branch from bfa23a8 to b487698 Compare February 3, 2022 18:58
@blanquicet blanquicet force-pushed the jose/initial-run-and-connectivity branch from b487698 to d6b55ce Compare February 4, 2022 09:54
@mauriciovasquezbernal mauriciovasquezbernal left a comment

Some minor comments but LGTM!

@blanquicet blanquicet force-pushed the jose/initial-run-and-connectivity branch from d6b55ce to 344a4f3 Compare February 14, 2022 17:59
@blanquicet blanquicet merged commit dfc3eaa into main Feb 14, 2022
@blanquicet blanquicet deleted the jose/initial-run-and-connectivity branch February 14, 2022 18:17
@blanquicet

> I think SIGINT needs some special handling. If I interrupt a running command, I need to wait for some timeout before I can run a command again.

> According to the documentation, we can't cancel a running script. However, BeginRunCommand receives a context, so maybe there is something we can do with it. I will create an issue to work on this.

I created issue #5 to further investigate this behaviour and try to understand how it can be improved (if possible).

blanquicet added a commit that referenced this pull request Feb 6, 2024
This change is necessary to ensure spinner and suffix are correctly
cleaned up when we call Stop. When we print something between Start and
Stop, the spinner and suffix remain there:

BEFORE THIS PR:

Example #1:

$ ./kubectl-aks config import
/ Importing...WARN[0030] Could not get VMSS VMs via Kubernetes API
WARN[0030] Please provide '--subscription', '--resource-group' and '--cluster-name' flags to get VMSS VMs via Azure API
Error: getting VMSS VMs via Kuberntes API: listing nodes: Get "https://172.28.128.4:6443/api/v1/nodes": dial tcp 172.28.128.4:6443: i/o timeout

Example #2:

$ ./kubectl-aks check-apiserver-connectivity
/ Checking connectivity...Connectivity check: failed with returned value 1

AFTER THIS PR:

Example #1:

$ ./kubectl-aks config import
WARN[0030] Could not get VMSS VMs via Kubernetes API
WARN[0030] Please provide '--subscription', '--resource-group' and '--cluster-name' flags to get VMSS VMs via Azure API
Error: getting VMSS VMs via Kuberntes API: listing nodes: Get "https://172.28.128.4:6443/api/v1/nodes": dial tcp 172.28.128.4:6443: i/o timeout

Example #2:

$ ./kubectl-aks check-apiserver-connectivity
Connectivity check: failed with returned value 1:

Signed-off-by: Jose Blanquicet <josebl@microsoft.com>
blanquicet added a commit that referenced this pull request Feb 7, 2024