RKE2 vSphere cluster provisioning failing, with failing kube-apiserver healthchecks due to inability to resolve localhost
This document (000021462) is provided subject to the disclaimer at the end of this document.
Environment
Rancher 2.7.x
Rancher 2.8.x
Situation
Symptoms
- High number of restarts of the core pods
NAMESPACE             NAME                                                READY   STATUS    RESTARTS
cattle-fleet-system   fleet-agent-cc8c97f97-bvx78                         1/1     Running   185
cattle-system         cattle-cluster-agent-b1460cbd-8ct5c                 1/1     Running   115
cattle-system         cattle-cluster-agent-b1460cbd-l2l8l                 1/1     Running   168
kube-system           kube-apiserver-cluster-suse-cp-f777105c-2qgvh       0/1     Running   314
kube-system           kube-controller-manager-cluster-suse-cp-5c-2qgvh    1/1     Running   491
kube-system           cloud-controller-manager-cluster-suse-cp-5c-2qgvh   1/1     Running   501
- The apiserver pod flaps between ready and not ready status
NAMESPACE     NAME                                            READY   STATUS    RESTARTS
kube-system   kube-apiserver-cluster-suse-cp-f777105c-2qgvh   0/1     Running   314
- The kubelet logs register failing probes against the kube-apiserver.
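The restart counts listed in the symptoms can be filtered out of a full pod listing. The following is a minimal sketch, not part of the original procedure: the helper name high_restarts and the threshold of 100 are hypothetical, and it assumes the column layout shown above, where RESTARTS is the fifth field of the kubectl get pods -A output.

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl get pods -A` output on stdin and print
# "namespace/name restarts" for every pod whose RESTARTS value (field 5)
# is greater than the given threshold (default: 100, an arbitrary choice).
high_restarts() {
    threshold="${1:-100}"
    # NR > 1 skips the header row; "$5 + 0" forces a numeric comparison.
    awk -v t="$threshold" 'NR > 1 && $5 + 0 > t { print $1 "/" $2, $5 }'
}

# Run against the cluster only if kubectl is available in this shell.
if command -v kubectl >/dev/null 2>&1; then
    kubectl get pods -A | high_restarts 100
fi
```

A long list of control-plane and agent pods in the result matches the first symptom described above.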
Resolution
1) Enable kubelet debug logging
- Click ☰ > Cluster Management.
- Go to the cluster you want to configure and click ⋮ > Edit Config.
- Advanced > Additional Kubelet Args > Add v=9 under "For all machines, use Kubelet args"
2) Replicate the livenessProbe and check the kubelet logs
2.1 Open an SSH session to a master node
2.2 Execute the following command to simulate the livenessProbe for kube-apiserver in the cluster
/var/lib/rancher/rke2/bin/crictl --runtime-endpoint unix:///run/k3s/containerd/containerd.sock exec \
  $(/var/lib/rancher/rke2/bin/crictl --runtime-endpoint unix:///run/k3s/containerd/containerd.sock ps | grep kube-apiserver | awk '{print $1}') \
  kubectl get --server=https://localhost:6443/ \
  --client-certificate=/var/lib/rancher/rke2/server/tls/client-kube-apiserver.crt \
  --client-key=/var/lib/rancher/rke2/server/tls/client-kube-apiserver.key \
  --certificate-authority=/var/lib/rancher/rke2/server/tls/server-ca.crt \
  --raw=/livez
2.3 Open another SSH session to the same master node and check the logs
tail -f /var/lib/rancher/rke2/agent/logs/kubelet.log | grep kube-apiserver
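With v=9 the kubelet log is very verbose, so it can help to narrow the output to probe-related lines. This is a minimal sketch under stated assumptions: the helper name apiserver_probe_lines is hypothetical, and it assumes only that the relevant lines contain both "kube-apiserver" and the word "probe" in some capitalization; the exact message wording may differ between kubelet versions.

```shell
#!/bin/sh
# Hypothetical helper: keep only log lines that mention both the
# kube-apiserver pod and a probe (case-insensitive match on "probe").
apiserver_probe_lines() {
    grep kube-apiserver | grep -i probe
}

# Log location taken from the tail command in step 2.3.
log=/var/lib/rancher/rke2/agent/logs/kubelet.log
if [ -r "$log" ]; then
    # Show the 20 most recent matching lines.
    apiserver_probe_lines < "$log" | tail -n 20
fi
```

For live output while the simulated probe runs, the same filter can be placed after tail -f, for example tail -f on the log file piped into apiserver_probe_lines.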
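3) Check DNS resolution and DNS lookups on the node, in particular whether localhost resolves, since the probe command targets https://localhost:6443/.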
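One way to start the resolution check is to confirm that the node itself can resolve localhost. The sketch below is an illustration rather than the documented fix: the helper name hosts_has_localhost is hypothetical, and it assumes a conventional /etc/hosts file and the getent utility being present on the node.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given hosts file (default /etc/hosts)
# maps a loopback address (127.x.x.x or ::1) to the name "localhost".
hosts_has_localhost() {
    hosts_file="${1:-/etc/hosts}"
    # Strip comments first, then look for "localhost" as a whole field.
    sed 's/#.*//' "$hosts_file" | awk '
        ($1 ~ /^127\./ || $1 == "::1") {
            for (i = 2; i <= NF; i++) if ($i == "localhost") found = 1
        }
        END { exit !found }'
}

if hosts_has_localhost; then
    echo "localhost is present in /etc/hosts"
else
    echo "localhost is missing from /etc/hosts"
fi

# Ask the system resolver, which follows the order set in /etc/nsswitch.conf.
getent hosts localhost || echo "localhost does not resolve on this node"
```

If localhost is missing from /etc/hosts, the lookup may fall through to the configured DNS servers, so their answer for localhost is worth checking as well.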
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.