Added log collection support for vanilla Kubeadm clusters #211

Merged: 2 commits, Feb 1, 2023
4 changes: 3 additions & 1 deletion collection/rancher/v2.x/logs-collector/README.md
@@ -3,6 +3,8 @@
## Notes

This script is intended to collect logs from [Rancher Kubernetes Engine (RKE) CLI](https://rancher.com/docs/rke/latest/en/) provisioned clusters, [K3s clusters](https://rancher.com/docs/k3s/latest/en/), [RKE2 clusters](https://docs.rke2.io/), Rancher provisioned [Custom](https://docs.ranchermanager.rancher.io/pages-for-subheaders/use-existing-nodes), and [Node Driver](https://docs.ranchermanager.rancher.io/pages-for-subheaders/use-new-nodes-in-an-infra-provider) clusters.
Fallback log collection support for vanilla Kubeadm clusters has also been added recently.


This script may not collect all necessary information when run on nodes in Hosted [Kubernetes Provider clusters](https://docs.ranchermanager.rancher.io/pages-for-subheaders/set-up-clusters-from-hosted-kubernetes-providers).

@@ -46,7 +48,7 @@ Rancher 2.x logs-collector
-d Output directory for temporary storage and .tar.gz archive (ex: -d /var/tmp)
-s Start day of journald and docker log collection, # of days relative to the current day (ex: -s 7)
-e End day of journald and docker log collection, # of days relative to the current day (ex: -e 5)
-r Override k8s distribution if not automatically detected (rke|k3s|rke2)
-r Override k8s distribution if not automatically detected (rke|k3s|rke2|kubeadm)
-p When supplied runs with the default nice/ionice priorities, otherwise use the lowest priorities
-f Force log collection if the minimum space isn't available
```
126 changes: 125 additions & 1 deletion collection/rancher/v2.x/logs-collector/rancher2_logs_collector.sh
@@ -109,6 +109,16 @@ sherlock() {
FOUND="rke"
fi
fi
if $(command -v kubeadm >/dev/null 2>&1)
then
if $(kubeadm version >/dev/null 2>&1)
then
DISTRO=kubeadm
echo "kubeadm"
else
FOUND+="kubeadm"
fi
fi
if [ -z ${DISTRO} ]
then
echo -e "\n$(timestamp): couldn't detect k8s distro"
@@ -526,6 +536,70 @@ rke2-k8s() {

}

kubeadm-k8s() {

KUBEADM_DIR="/etc/kubernetes/"
KUBEADM_STATIC_DIR="/etc/kubernetes/manifests/"
techo "Collecting k8s kubeadm cluster logs"
if [ -f /$USER/.kube/config ]; then
Review comment (Contributor): It's possible that a user doesn't store the kubeconfig in this location. Can we make this configurable?

KUBECONFIG=${KUBECONFIG:-"$USER/.kube/config"}

mkdir -p $TMPDIR/kubeadm/kubectl
KUBECONFIG=/$USER/.kube/config
/usr/bin/kubectl --kubeconfig=$KUBECONFIG get nodes -o wide > $TMPDIR/kubeadm/kubectl/nodes 2>&1
Review comment (Contributor): Make kubectl binary location also configurable.

KUBECTL_CMD=${KUBECTL_CMD:-"/usr/bin/kubectl"}

/usr/bin/kubectl --kubeconfig=$KUBECONFIG describe nodes > $TMPDIR/kubeadm/kubectl/nodesdescribe 2>&1
/usr/bin/kubectl --kubeconfig=$KUBECONFIG version > $TMPDIR/kubeadm/kubectl/version 2>&1
/usr/bin/kubectl --kubeconfig=$KUBECONFIG get pods -o wide --all-namespaces > $TMPDIR/kubeadm/kubectl/pods 2>&1
/usr/bin/kubectl --kubeconfig=$KUBECONFIG get svc -o wide --all-namespaces > $TMPDIR/kubeadm/kubectl/services 2>&1
/usr/bin/kubectl --kubeconfig=$KUBECONFIG cluster-info dump > $TMPDIR/kubeadm/kubectl/cluster-info_dump 2>&1
fi

if [ -f /$USER/.kube/config ]; then
Review comment (Contributor): Curious, why do we have two "if" blocks?

KUBECONFIG=/$USER/.kube/config
/usr/bin/kubectl --kubeconfig=$KUBECONFIG api-resources > $TMPDIR/kubeadm/kubectl/api-resources 2>&1
KUBEADM_OBJECTS=(clusterroles clusterrolebindings crds mutatingwebhookconfigurations namespaces nodes pv validatingwebhookconfigurations)
KUBEADM_OBJECTS_NAMESPACED=(apiservices configmaps cronjobs deployments daemonsets endpoints events helmcharts hpa ingress jobs leases pods pvc replicasets roles rolebindings statefulsets)
for OBJECT in "${KUBEADM_OBJECTS[@]}"; do
/usr/bin/kubectl --kubeconfig=$KUBECONFIG get ${OBJECT} -o wide > $TMPDIR/kubeadm/kubectl/${OBJECT} 2>&1
done
for OBJECT in "${KUBEADM_OBJECTS_NAMESPACED[@]}"; do
/usr/bin/kubectl --kubeconfig=$KUBECONFIG get ${OBJECT} --all-namespaces -o wide > $TMPDIR/kubeadm/kubectl/${OBJECT} 2>&1
done
fi

mkdir -p $TMPDIR/kubeadm/podlogs
techo "Collecting k8s kubeadm system pod logs"
if [ -f /$USER/.kube/config ]; then
Review comment (Contributor): I think these if checks can be done once at the top of the function, not needed for every section.

KUBECONFIG=/$USER/.kube/config
for SYSTEM_NAMESPACE in "${SYSTEM_NAMESPACES[@]}"; do
for SYSTEM_POD in $(/usr/bin/kubectl --kubeconfig=$KUBECONFIG -n $SYSTEM_NAMESPACE get pods --no-headers -o custom-columns=NAME:.metadata.name); do
/usr/bin/kubectl --kubeconfig=$KUBECONFIG -n $SYSTEM_NAMESPACE logs --all-containers $SYSTEM_POD > $TMPDIR/kubeadm/podlogs/$SYSTEM_NAMESPACE-$SYSTEM_POD 2>&1
/usr/bin/kubectl --kubeconfig=$KUBECONFIG -n $SYSTEM_NAMESPACE logs -p --all-containers $SYSTEM_POD > $TMPDIR/kubeadm/podlogs/$SYSTEM_NAMESPACE-$SYSTEM_POD-previous 2>&1
done
done
for SYSTEM_NAMESPACE in "${SYSTEM_NAMESPACES[@]}"; do
if ls -d /var/log/pods/$SYSTEM_NAMESPACE* > /dev/null 2>&1; then
cp -r -p /var/log/pods/$SYSTEM_NAMESPACE* $TMPDIR/kubeadm/podlogs/
fi
done
fi

techo "Collecting k8s kubeadm metrics"
if [ -f /$USER/.kube/config ]; then
KUBECONFIG=/$USER/.kube/config
/usr/bin/kubectl --kubeconfig=$KUBECONFIG top node > $TMPDIR/kubeadm/metrics_nodes 2>&1
/usr/bin/kubectl --kubeconfig=$KUBECONFIG top pod > $TMPDIR/kubeadm/metrics_pod 2>&1
/usr/bin/kubectl --kubeconfig=$KUBECONFIG top pod --containers=true > $TMPDIR/kubeadm/metrics_containers 2>&1
fi

techo "Collecting k8s kubeadm static pods info and containers logs"
if [ -d /var/log/containers/ ]; then
cp -rp /var/log/containers $TMPDIR/kubeadm/containers-varlogs
fi
if [ -d $KUBEADM_STATIC_DIR ]; then
ls -lah $KUBEADM_STATIC_DIR > $TMPDIR/kubeadm/staticpodlist 2>&1
fi

}
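
The review comments on this function ask for a configurable kubeconfig path and kubectl binary, and for a single check at the top instead of one per section. A minimal sketch of that shape follows. `KUBECTL_CMD` and the `KUBECONFIG` override come from the reviewer's suggestions; the `$HOME/.kube/config` default and the helper name `kubeadm_kubectl_ready` are assumptions for illustration and are not part of the merged change.

```shell
#!/usr/bin/env bash
# Sketch only: resolve kubectl and the kubeconfig once, before any collection step.
# Both defaults are assumptions and can be overridden from the environment.
KUBECONFIG="${KUBECONFIG:-$HOME/.kube/config}"
KUBECTL_CMD="${KUBECTL_CMD:-/usr/bin/kubectl}"

# Hypothetical helper: succeeds only if the kubectl binary is executable
# and the kubeconfig file exists.
kubeadm_kubectl_ready() {
  [ -x "${KUBECTL_CMD}" ] && [ -f "${KUBECONFIG}" ]
}

if kubeadm_kubectl_ready; then
  echo "kubectl collection enabled (${KUBECTL_CMD}, ${KUBECONFIG})"
else
  echo "skipping kubectl collection: kubectl binary or kubeconfig not found"
fi
```

With a guard like this, each collection section could call `"${KUBECTL_CMD}" --kubeconfig="${KUBECONFIG}" ...` inside one conditional block rather than repeating the file test.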

var-log() {

techo "Collecting system logs from /var/log"
@@ -624,6 +698,32 @@ k3s-certs() {

}

kubeadm-certs() {

if [ -d /etc/kubernetes/pki/ ]
then
techo "Collecting k8s kubeadm directory state"
mkdir -p $TMPDIR/kubeadm/directories
ls -lah /etc/kubernetes/ > $TMPDIR/kubeadm/directories/kubeadm 2>&1
techo "Collecting k8s kubeadm certificates"
mkdir -p $TMPDIR/kubeadm/pki/{server,kubelet}
SERVER_CERTS=$(find /etc/kubernetes/pki/ -maxdepth 2 -type f -name "*.crt" | grep -v "\-ca.crt$")
for CERT in $SERVER_CERTS
do
openssl x509 -in $CERT -text -noout > $TMPDIR/kubeadm/pki/server/$(basename $CERT) 2>&1
done
if [ -d /var/lib/kubelet/pki/ ]; then
techo "Collecting kubelet certificates"
AGENT_CERTS=$(find /var/lib/kubelet/pki/ -maxdepth 2 -type f -name "*.crt" | grep -v "\-ca.crt$")
for CERT in $AGENT_CERTS
do
openssl x509 -in $CERT -text -noout > $TMPDIR/kubeadm/pki/kubelet/$(basename $CERT) 2>&1
done
fi
fi

}

rke2-certs() {

if [ -d ${RKE2_DIR} ]
@@ -706,6 +806,25 @@ rke2-etcd() {

}

kubeadm-etcd() {

KUBEADM_ETCD_DIR="/var/lib/etcd/"
KUBEADM_ETCD_CERTS="/etc/kubernetes/pki/etcd/"
Review comment (Contributor): Add ETCD_CMD checks. This binary might not exist on the host.


if [ -d $KUBEADM_ETCD_DIR ]; then
techo "Collecting kubeadm etcd info"
mkdir -p $TMPDIR/etcd
ETCDCTL_ENDPOINTS=$(/usr/bin/etcdctl --cert ${KUBEADM_ETCD_CERTS}/server.crt --key ${KUBEADM_ETCD_CERTS}/server.key --cacert ${KUBEADM_ETCD_CERTS}/ca.crt --write-out="simple" endpoint status | cut -d "," -f 1)
/usr/bin/etcdctl --endpoints=$ETCDCTL_ENDPOINTS --cert ${KUBEADM_ETCD_CERTS}/server.crt --key ${KUBEADM_ETCD_CERTS}/server.key --cacert ${KUBEADM_ETCD_CERTS}/ca.crt --write-out table endpoint status > $TMPDIR/etcd/endpointstatus 2>&1
/usr/bin/etcdctl --endpoints=$ETCDCTL_ENDPOINTS --cert ${KUBEADM_ETCD_CERTS}/server.crt --key ${KUBEADM_ETCD_CERTS}/server.key --cacert ${KUBEADM_ETCD_CERTS}/ca.crt endpoint health > $TMPDIR/etcd/endpointhealth 2>&1
/usr/bin/etcdctl --endpoints=$ETCDCTL_ENDPOINTS --cert ${KUBEADM_ETCD_CERTS}/server.crt --key ${KUBEADM_ETCD_CERTS}/server.key --cacert ${KUBEADM_ETCD_CERTS}/ca.crt alarm list > $TMPDIR/etcd/alarmlist 2>&1
fi

if [ -d ${KUBEADM_ETCD_DIR} ]; then
find ${KUBEADM_ETCD_DIR} -type f -exec ls -la {} \; > $TMPDIR/etcd/findserverdbetcd 2>&1
fi
}
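
The review comment on this function asks for a check that the etcdctl binary exists on the host before it is called. One way to sketch that check is shown here. `ETCD_CMD` is named in the review but not defined anywhere in this diff, so its default value and the helper name `have_cmd` are assumptions.

```shell
#!/usr/bin/env bash
# Sketch only: ETCD_CMD is an assumed override; the diff hardcodes /usr/bin/etcdctl.
ETCD_CMD="${ETCD_CMD:-/usr/bin/etcdctl}"

# Hypothetical helper: true if the argument resolves to an executable,
# given either as an absolute path or as a name found on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd "${ETCD_CMD}"; then
  echo "etcdctl found: ${ETCD_CMD}"
else
  echo "etcdctl not found, skipping etcd endpoint collection"
fi
```

The directory listing of the etcd data directory does not depend on etcdctl, so only the `endpoint status`, `endpoint health`, and `alarm list` calls would need to sit behind this guard.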

timeout_cmd() {

TIMEOUT_EXCEEDED_MSG="$1 command timed out, killing process to prevent hanging."
@@ -749,7 +868,7 @@ help() {
-e End day of journald and docker log collection. Specify the number of days before the current time (ex: -e 5)
-S Start date of journald and docker log collection. (ex: -S 2022-12-05)
-E End date of journald and docker log collection. (ex: -E 2022-12-07)
-r Override k8s distribution if not automatically detected (rke|k3s|rke2)
-r Override k8s distribution if not automatically detected (rke|k3s|rke2|kubeadm)
-p When supplied runs with the default nice/ionice priorities, otherwise use the lowest priorities
-f Force log collection if the minimum space isn't available"

@@ -870,6 +989,11 @@ elif [ "${DISTRO}" = "rke2" ]
rke2-k8s
rke2-certs
rke2-etcd
elif [ "${DISTRO}" = "kubeadm" ]
then
kubeadm-k8s
kubeadm-certs
kubeadm-etcd
fi
var-log
if [ "${INIT}" = "systemd" ]