Added log collection support for vanilla Kubeadm clusters #211

Merged: 2 commits, Feb 1, 2023
11 changes: 9 additions & 2 deletions collection/rancher/v2.x/logs-collector/README.md
@@ -2,7 +2,14 @@

## Notes

This script is intended to collect logs from [Rancher Kubernetes Engine (RKE) CLI](https://rancher.com/docs/rke/latest/en/) provisioned clusters, [K3s clusters](https://rancher.com/docs/k3s/latest/en/), [RKE2 clusters](https://docs.rke2.io/), Rancher provisioned [Custom](https://docs.ranchermanager.rancher.io/pages-for-subheaders/use-existing-nodes), and [Node Driver](https://docs.ranchermanager.rancher.io/pages-for-subheaders/use-new-nodes-in-an-infra-provider) clusters.
This script is intended to collect logs from:
- [Rancher Kubernetes Engine (RKE) CLI](https://rancher.com/docs/rke/latest/en/) provisioned clusters
- [K3s clusters](https://rancher.com/docs/k3s/latest/en/)
- [RKE2 clusters](https://docs.rke2.io/)
- Rancher provisioned [Custom](https://docs.ranchermanager.rancher.io/pages-for-subheaders/use-existing-nodes)
- [Node Driver](https://docs.ranchermanager.rancher.io/pages-for-subheaders/use-new-nodes-in-an-infra-provider) clusters
- [Kubeadm](https://kubernetes.io/docs/reference/setup-tools/kubeadm/) clusters


This script may not collect all necessary information when run on nodes in Hosted [Kubernetes Provider clusters](https://docs.ranchermanager.rancher.io/pages-for-subheaders/set-up-clusters-from-hosted-kubernetes-providers).

@@ -46,7 +53,7 @@ Rancher 2.x logs-collector
-d Output directory for temporary storage and .tar.gz archive (ex: -d /var/tmp)
-s Start day of journald and docker log collection, # of days relative to the current day (ex: -s 7)
-e End day of journald and docker log collection, # of days relative to the current day (ex: -e 5)
-r Override k8s distribution if not automatically detected (rke|k3s|rke2)
-r Override k8s distribution if not automatically detected (rke|k3s|rke2|kubeadm)
-p When supplied runs with the default nice/ionice priorities, otherwise use the lowest priorities
-f Force log collection if the minimum space isn't available
```
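The `-r` override now lists `kubeadm` alongside the existing values. A minimal sketch of how such a value could be checked before dispatching; `validate_distro` is a hypothetical helper for illustration and is not a function in `rancher2_logs_collector.sh`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the script): accept only the
# distribution names listed in the -r help text.
validate_distro() {
  case "$1" in
    rke|k3s|rke2|kubeadm)
      echo "$1"
      ;;
    *)
      echo "error: unsupported distribution '$1' (expected rke|k3s|rke2|kubeadm)" >&2
      return 1
      ;;
  esac
}
```

A caller might use it as `DISTRO=$(validate_distro "$OPTARG") || exit 1`, where `OPTARG` is assumed to hold the argument passed to `-r`.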
126 changes: 123 additions & 3 deletions collection/rancher/v2.x/logs-collector/rancher2_logs_collector.sh
100644 → 100755
@@ -442,7 +442,7 @@ k3s-k8s() {

if [ -d /var/lib/rancher/k3s/server ]; then
unset KUBECONFIG
kubectl api-resources > $TMPDIR/k3s/kubectl/api-resources 2>&1
k3s kubectl api-resources > $TMPDIR/k3s/kubectl/api-resources 2>&1
K3S_OBJECTS=(clusterroles clusterrolebindings crds mutatingwebhookconfigurations namespaces nodes pv validatingwebhookconfigurations)
K3S_OBJECTS_NAMESPACED=(apiservices configmaps cronjobs deployments daemonsets endpoints events helmcharts hpa ingress jobs leases pods pvc replicasets roles rolebindings statefulsets)
for OBJECT in "${K3S_OBJECTS[@]}"; do
@@ -526,6 +526,68 @@ rke2-k8s() {

}

kubeadm-k8s() {

KUBEADM_DIR="/etc/kubernetes/"
KUBEADM_STATIC_DIR="/etc/kubernetes/manifests/"
if ! command -v kubeadm >/dev/null 2>&1; then
echo "error: kubeadm command not found"
exit 1
fi

if ! command -v kubectl >/dev/null 2>&1; then
echo "error: kubectl command not found"
exit 1
fi

# Fall back to the default kubeconfig path when KUBECONFIG is unset or empty
KUBECONFIG=${KUBECONFIG:-$HOME/.kube/config}
techo "Collecting k8s kubeadm cluster logs"
mkdir -p $TMPDIR/kubeadm/kubectl
kubectl --kubeconfig=$KUBECONFIG get nodes -o wide > $TMPDIR/kubeadm/kubectl/nodes 2>&1
kubectl --kubeconfig=$KUBECONFIG describe nodes > $TMPDIR/kubeadm/kubectl/nodesdescribe 2>&1
kubectl --kubeconfig=$KUBECONFIG version > $TMPDIR/kubeadm/kubectl/version 2>&1
kubectl --kubeconfig=$KUBECONFIG get pods -o wide --all-namespaces > $TMPDIR/kubeadm/kubectl/pods 2>&1
kubectl --kubeconfig=$KUBECONFIG get svc -o wide --all-namespaces > $TMPDIR/kubeadm/kubectl/services 2>&1
kubectl --kubeconfig=$KUBECONFIG cluster-info dump > $TMPDIR/kubeadm/kubectl/cluster-info_dump 2>&1

kubectl --kubeconfig=$KUBECONFIG api-resources > $TMPDIR/kubeadm/kubectl/api-resources 2>&1
KUBEADM_OBJECTS=(clusterroles clusterrolebindings crds mutatingwebhookconfigurations namespaces nodes pv validatingwebhookconfigurations)
KUBEADM_OBJECTS_NAMESPACED=(apiservices configmaps cronjobs deployments daemonsets endpoints events helmcharts hpa ingress jobs leases pods pvc replicasets roles rolebindings statefulsets)
for OBJECT in "${KUBEADM_OBJECTS[@]}"; do
kubectl --kubeconfig=$KUBECONFIG get ${OBJECT} -o wide > $TMPDIR/kubeadm/kubectl/${OBJECT} 2>&1
done
for OBJECT in "${KUBEADM_OBJECTS_NAMESPACED[@]}"; do
kubectl --kubeconfig=$KUBECONFIG get ${OBJECT} --all-namespaces -o wide > $TMPDIR/kubeadm/kubectl/${OBJECT} 2>&1
done

mkdir -p $TMPDIR/kubeadm/podlogs
techo "Collecting k8s kubeadm system pod logs"
for SYSTEM_NAMESPACE in "${SYSTEM_NAMESPACES[@]}"; do
for SYSTEM_POD in $(kubectl --kubeconfig=$KUBECONFIG -n $SYSTEM_NAMESPACE get pods --no-headers -o custom-columns=NAME:.metadata.name); do
kubectl --kubeconfig=$KUBECONFIG -n $SYSTEM_NAMESPACE logs --all-containers $SYSTEM_POD > $TMPDIR/kubeadm/podlogs/$SYSTEM_NAMESPACE-$SYSTEM_POD 2>&1
kubectl --kubeconfig=$KUBECONFIG -n $SYSTEM_NAMESPACE logs -p --all-containers $SYSTEM_POD > $TMPDIR/kubeadm/podlogs/$SYSTEM_NAMESPACE-$SYSTEM_POD-previous 2>&1
done
done
for SYSTEM_NAMESPACE in "${SYSTEM_NAMESPACES[@]}"; do
if ls -d /var/log/pods/$SYSTEM_NAMESPACE* > /dev/null 2>&1; then
cp -r -p /var/log/pods/$SYSTEM_NAMESPACE* $TMPDIR/kubeadm/podlogs/
fi
done

techo "Collecting k8s kubeadm metrics"
kubectl --kubeconfig=$KUBECONFIG top node > $TMPDIR/kubeadm/metrics_nodes 2>&1
kubectl --kubeconfig=$KUBECONFIG top pod > $TMPDIR/kubeadm/metrics_pod 2>&1
kubectl --kubeconfig=$KUBECONFIG top pod --containers=true > $TMPDIR/kubeadm/metrics_containers 2>&1

techo "Collecting k8s kubeadm static pods info and containers logs"
if [ -d /var/log/containers/ ]; then
cp -rp /var/log/containers $TMPDIR/kubeadm/containers-varlogs
fi
if [ -d $KUBEADM_STATIC_DIR ]; then
ls -lah $KUBEADM_STATIC_DIR > $TMPDIR/kubeadm/staticpodlist 2>&1
fi
}
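The function above appears to intend a fallback kubeconfig path when `KUBECONFIG` is not set. A minimal sketch of that fallback using the `:-` default-value expansion; `resolve_kubeconfig` is a hypothetical helper, and `$HOME/.kube/config` as the default is an assumption (on kubeadm control-plane nodes `/etc/kubernetes/admin.conf` is another common choice).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the kubeconfig path kubectl should use.
# Assumption: $HOME/.kube/config is the desired fallback location.
resolve_kubeconfig() {
  # ${VAR:-default} expands to default when VAR is unset or empty.
  echo "${KUBECONFIG:-$HOME/.kube/config}"
}
```

Note that `${VAR:"text"}` without the `-` is parsed by bash as a substring offset rather than a default value, so the `-` matters here.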

var-log() {

techo "Collecting system logs from /var/log"
@@ -624,6 +686,36 @@ k3s-certs() {

}

kubeadm-certs() {
if ! command -v openssl >/dev/null 2>&1; then
echo "error: openssl command not found"
exit 1
fi

if [ -d /etc/kubernetes/pki/ ]
then
techo "Collecting k8s kubeadm directory state"
mkdir -p $TMPDIR/kubeadm/directories
ls -lah /etc/kubernetes/ > $TMPDIR/kubeadm/directories/kubeadm 2>&1
techo "Collecting k8s kubeadm certificates"
mkdir -p $TMPDIR/kubeadm/pki/{server,kubelet}
SERVER_CERTS=$(find /etc/kubernetes/pki/ -maxdepth 2 -type f -name "*.crt" | grep -v "\-ca.crt$")
for CERT in $SERVER_CERTS
do
openssl x509 -in $CERT -text -noout > $TMPDIR/kubeadm/pki/server/$(basename $CERT) 2>&1
done
if [ -d /var/lib/kubelet/pki/ ]; then
techo "Collecting kubelet certificates"
AGENT_CERTS=$(find /var/lib/kubelet/pki/ -maxdepth 2 -type f -name "*.crt" | grep -v "\-ca.crt$")
for CERT in $AGENT_CERTS
do
openssl x509 -in $CERT -text -noout > $TMPDIR/kubeadm/pki/kubelet/$(basename $CERT) 2>&1
done
fi
fi

}
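The `find ... | grep -v` pipeline in the function above decides which certificates get decoded. A minimal sketch of an equivalent filter run against a throwaway directory, to show which names it keeps; `list_filtered_certs` is a hypothetical helper, and the placeholder file names follow the usual kubeadm PKI layout, which is assumed here for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the find/grep filter used above:
# list *.crt files up to two levels deep, dropping names ending in "-ca.crt".
list_filtered_certs() {
  find "$1" -maxdepth 2 -type f -name "*.crt" | grep -v -- '-ca\.crt$' | sort
}

# Demo directory with empty placeholder files (names assumed, not real certs).
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/etcd"
touch "$demo_dir/ca.crt" "$demo_dir/apiserver.crt" \
      "$demo_dir/front-proxy-ca.crt" "$demo_dir/etcd/ca.crt" \
      "$demo_dir/etcd/server.crt"

# Print the surviving paths relative to the demo directory.
list_filtered_certs "$demo_dir" | sed "s|^$demo_dir/||"
rm -rf "$demo_dir"
```

With these names the filter drops `front-proxy-ca.crt` but keeps `ca.crt` and `etcd/ca.crt`, because the pattern requires a hyphen before `ca.crt`. Whether the CA certificates should be collected is a decision for the maintainers.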

rke2-certs() {

if [ -d ${RKE2_DIR} ]
@@ -706,6 +798,29 @@ rke2-etcd() {

}

kubeadm-etcd() {
KUBEADM_ETCD_DIR="/var/lib/etcd/"
KUBEADM_ETCD_CERTS="/etc/kubernetes/pki/etcd/"
Contributor review comment: Add ETCD_CMD checks. This binary might not exist on the host.


if ! command -v etcdctl >/dev/null 2>&1; then
echo "error: etcdctl command not found"
exit 1
fi

if [ -d $KUBEADM_ETCD_DIR ]; then
techo "Collecting kubeadm etcd info"
mkdir -p $TMPDIR/etcd
ETCDCTL_ENDPOINTS=$(etcdctl --cert ${KUBEADM_ETCD_CERTS}/server.crt --key ${KUBEADM_ETCD_CERTS}/server.key --cacert ${KUBEADM_ETCD_CERTS}/ca.crt --write-out="simple" endpoint status | cut -d "," -f 1)
etcdctl --endpoints=$ETCDCTL_ENDPOINTS --cert ${KUBEADM_ETCD_CERTS}/server.crt --key ${KUBEADM_ETCD_CERTS}/server.key --cacert ${KUBEADM_ETCD_CERTS}/ca.crt --write-out table endpoint status > $TMPDIR/etcd/endpointstatus 2>&1
etcdctl --endpoints=$ETCDCTL_ENDPOINTS --cert ${KUBEADM_ETCD_CERTS}/server.crt --key ${KUBEADM_ETCD_CERTS}/server.key --cacert ${KUBEADM_ETCD_CERTS}/ca.crt endpoint health > $TMPDIR/etcd/endpointhealth 2>&1
etcdctl --endpoints=$ETCDCTL_ENDPOINTS --cert ${KUBEADM_ETCD_CERTS}/server.crt --key ${KUBEADM_ETCD_CERTS}/server.key --cacert ${KUBEADM_ETCD_CERTS}/ca.crt alarm list > $TMPDIR/etcd/alarmlist 2>&1
fi

if [ -d ${KUBEADM_ETCD_DIR} ]; then
find ${KUBEADM_ETCD_DIR} -type f -exec ls -la {} \; > $TMPDIR/etcd/findserverdbetcd 2>&1
fi
}
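The review comment asks for a check that the etcd client binary exists. A minimal sketch of one way to do that without aborting the whole run; `require_cmd` is a hypothetical helper, and returning instead of exiting is an assumption about the desired behaviour, not what the PR implements.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report a missing binary and return non-zero instead
# of exiting, so the caller can skip one collection step and carry on.
require_cmd() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "warning: $1 command not found, skipping" >&2
    return 1
  fi
}

# Usage sketch: only attempt etcd collection when etcdctl is present.
if require_cmd etcdctl 2>/dev/null; then
  echo "etcdctl found, etcd collection can proceed"
else
  echo "etcdctl missing, etcd collection skipped"
fi
```

The functions in this PR call `exit 1` when a binary is missing, which stops the remaining collectors as well. Whether skipping or aborting is preferable is a judgment call for the maintainers.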

timeout_cmd() {

TIMEOUT_EXCEEDED_MSG="$1 command timed out, killing process to prevent hanging."
@@ -749,7 +864,7 @@ help() {
-e End day of journald and docker log collection. Specify the number of days before the current time (ex: -e 5)
-S Start date of journald and docker log collection. (ex: -S 2022-12-05)
-E End date of journald and docker log collection. (ex: -E 2022-12-07)
-r Override k8s distribution if not automatically detected (rke|k3s|rke2)
-r Override k8s distribution if not automatically detected (rke|k3s|rke2|kubeadm)
-p When supplied runs with the default nice/ionice priorities, otherwise use the lowest priorities
-f Force log collection if the minimum space isn't available"

@@ -768,7 +883,7 @@ techo() {
}

# Check if we're running as root.
if [[ $EUID -ne 0 ]]
if [[ $EUID -ne 0 ]] && [[ "${DEV}" == "" ]]
then
techo "This script must be run as root"
exit 1
@@ -870,6 +985,11 @@ elif [ "${DISTRO}" = "rke2" ]
rke2-k8s
rke2-certs
rke2-etcd
elif [ "${DISTRO}" = "kubeadm" ]
then
kubeadm-k8s
kubeadm-certs
kubeadm-etcd
fi
var-log
if [ "${INIT}" = "systemd" ]
Empty file modified collection/rancher/v2.x/systems-information/run.sh
100644 → 100755
Empty file.
Empty file modified collection/rancher/v2.x/systems-information/systems_summary.sh
100644 → 100755
Empty file.
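The `100644 → 100755` entries mean only the file mode changed: the scripts gained the executable bits. A minimal sketch of the same change on a throwaway file; `mode_string` is a hypothetical helper and the temporary path comes from `mktemp`, not from the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the ten-character type/permission string
# that `ls -l` shows for a path.
mode_string() {
  ls -ld "$1" | cut -c1-10
}

demo_script=$(mktemp)

chmod 644 "$demo_script"   # rw-r--r--: readable by all, not executable
mode_string "$demo_script"

chmod 755 "$demo_script"   # rwxr-xr-x: executable by owner, group, others
mode_string "$demo_script"

rm -f "$demo_script"
```

In a Git checkout the equivalent change is typically recorded after running `chmod +x` on the file and committing it.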