RKE2 vSphere cluster provisioning failing, with failing kube-apiserver healthchecks due to inability to resolve localhost

This document (000021462) is provided subject to the disclaimer at the end of this document.

Environment

Rancher 2.7.x

Rancher 2.8.x

Situation

Symptoms

  • A high number of restarts of the core pods
NAMESPACE            NAME                                               READY   STATUS    RESTARTS
cattle-fleet-system  fleet-agent-cc8c97f97-bvx78                        1/1     Running   185
cattle-system        cattle-cluster-agent-b1460cbd-8ct5c                1/1     Running   115
cattle-system        cattle-cluster-agent-b1460cbd-l2l8l                1/1     Running   168
kube-system          kube-apiserver-cluster-suse-cp-f777105c-2qgvh      0/1     Running   314
kube-system          kube-controller-manager-cluster-suse-cp-5c-2qgvh   1/1     Running   491
kube-system          cloud-controller-manager-cluster-suse-cp-5c-2qgvh  1/1     Running   501
  • The apiserver pod flaps between ready and not ready status
NAMESPACE            NAME                                               READY   STATUS    RESTARTS
kube-system          kube-apiserver-cluster-suse-cp-f777105c-2qgvh      0/1     Running   314
  • The kubelet logs register failing probes against the kube-apiserver.
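
To spot the first symptom quickly, the pod listing can be filtered by restart count. The following is a minimal sketch, not part of the original procedure: high_restarts is a hypothetical helper name, the default threshold of 100 is an arbitrary assumption, and the input is assumed to be in the column layout shown above (namespace, name, ready, status, restarts).

```shell
# Hypothetical helper: print "namespace/name restarts" for every pod whose
# RESTARTS column (5th field) is at or above the given threshold.
# Reads `kubectl get pods -A` style output on stdin; the header line and
# blank or short lines are skipped.
high_restarts() {
  awk -v min="${1:-100}" \
    'NR > 1 && NF >= 5 && $5 + 0 >= min + 0 { print $1 "/" $2 " " $5 }'
}

# Usage, assuming kubectl access to the affected cluster:
#   kubectl get pods -A | high_restarts 100
```

Pods that appear in this output alongside a kube-apiserver pod that is not ready match the pattern described in this article.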

Resolution

1) Enable kubelet debug logging

  1. Click ☰ > Cluster Management.

  2. Go to the cluster you want to configure and click ⋮ > Edit Config.

  3. Go to Advanced > Additional Kubelet Args and add v=9 under "For all machines, use Kubelet args".

2) Replicate the livenessProbe and check the kubelet logs

   2.1 Open an SSH session to a master node
   2.2 Run the following command to simulate the kube-apiserver livenessProbe in the cluster
/var/lib/rancher/rke2/bin/crictl --runtime-endpoint unix:///run/k3s/containerd/containerd.sock exec $(/var/lib/rancher/rke2/bin/crictl --runtime-endpoint unix:///run/k3s/containerd/containerd.sock ps | grep kube-apiserver | awk '{print $1}') kubectl get --server=https://localhost:6443/ --client-certificate=/var/lib/rancher/rke2/server/tls/client-kube-apiserver.crt --client-key=/var/lib/rancher/rke2/server/tls/client-kube-apiserver.key --certificate-authority=/var/lib/rancher/rke2/server/tls/server-ca.crt --raw=/livez
   2.3 Open another SSH session to the same master node and check the logs
tail -f /var/lib/rancher/rke2/agent/logs/kubelet.log | grep kube-apiserver
   2.4 Check the DNS resolution and DNS lookups, in particular whether localhost resolves on the node.
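
One way to begin the DNS check is to confirm that the node maps localhost to a loopback address. The sketch below is an assumption about where to look, not a step from the original procedure: hosts_has_localhost is a hypothetical helper name, and it only inspects a hosts file (by default /etc/hosts), so it does not cover resolver or search-domain settings.

```shell
# Hypothetical helper: succeed (exit 0) if the given hosts file maps a
# loopback address (127.0.0.1 or ::1) to the name "localhost".
hosts_has_localhost() {
  hosts_file="${1:-/etc/hosts}"
  awk '
    /^[[:space:]]*#/ { next }            # skip full-line comments
    ($1 == "127.0.0.1" || $1 == "::1") {
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^#/) break             # stop at a trailing comment
        if ($i == "localhost") found = 1
      }
    }
    END { exit (found ? 0 : 1) }
  ' "$hosts_file"
}

# Usage on the affected node:
#   hosts_has_localhost /etc/hosts || echo "no localhost entry in /etc/hosts"
#   getent hosts localhost   # shows what the system resolver returns
```

If the hosts file has no such entry, or getent returns a non-loopback address for localhost, the probe request to https://localhost:6443/ would depend on external DNS, which is consistent with the failure described in the title.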

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.