Rancher managed RKE2 clusters stuck in "Waiting for probes: kube-controller-manager, kube-scheduler"
This document (000021371) is provided subject to the disclaimer at the end of this document.
Environment
Rancher <2.7.5
RKE2
Situation
Downstream RKE2 clusters are stuck in the Rancher UI with the status:
"Waiting for probes: kube-controller-manager, kube-scheduler"
Resolution
Stop the RKE2 server:
systemctl stop rke2-server
Remove the .crt and .key files from the respective tls directories:
rm /var/lib/rancher/rke2/server/tls/kube-controller-manager/kube-controller-manager.{crt,key}
and
rm /var/lib/rancher/rke2/server/tls/kube-scheduler/kube-scheduler.{crt,key}
Perform a certificate rotation:
rke2 certificate rotate
Start the RKE2 server again:
systemctl start rke2-server
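The steps above can be collected into one script to run as root on an affected server node. This is only a sketch under stated assumptions: the TLS_DIR and DRY_RUN variables and the run and rotate_probe_certs helpers are names introduced here for illustration and are not part of RKE2 or Rancher. It uses rm -f so that an already missing file does not abort the run. DRY_RUN defaults to 1, so the script only prints the commands it would execute.

```shell
#!/bin/sh
# Sketch only: consolidates the manual resolution steps from this article.
# TLS_DIR and DRY_RUN are illustrative variables, not RKE2 settings.
TLS_DIR="${TLS_DIR:-/var/lib/rancher/rke2/server/tls}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

rotate_probe_certs() {
    # 1. Stop the RKE2 server.
    run systemctl stop rke2-server || return 1

    # 2. Remove the certificate and key of both components.
    for component in kube-controller-manager kube-scheduler; do
        for ext in crt key; do
            run rm -f "$TLS_DIR/$component/$component.$ext" || return 1
        done
    done

    # 3. Rotate the certificates, then start the server again.
    run rke2 certificate rotate || return 1
    run systemctl start rke2-server || return 1
}

rotate_probe_certs
```

Review the printed commands first, then run the script with DRY_RUN=0 to apply them.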
Cause
This issue should be fixed in Rancher 2.7.5. As of that release, the certificates that may cause a cluster to get stuck on upgrade are also rotated when using the "Rotate Certificates" feature.
It's highly recommended to rotate certificates periodically (at least once a year) to ensure they do not expire.
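To see how close the two certificates named in this article are to expiring, their end dates can be read with openssl on a server node. This is a minimal sketch, assuming openssl is installed on the node. The check_cert helper and the WARN_DAYS threshold are hypothetical names introduced here, not part of RKE2 or Rancher.

```shell
#!/bin/sh
# Sketch only: report the expiry of the certificates named in this article.
# WARN_DAYS is an illustrative threshold, not an RKE2 or Rancher setting.
TLS_DIR="${TLS_DIR:-/var/lib/rancher/rke2/server/tls}"
WARN_DAYS="${WARN_DAYS:-30}"

check_cert() {
    cert="$1"
    if [ ! -f "$cert" ]; then
        echo "missing: $cert"
        return 2
    fi
    # -enddate prints the expiry date as "notAfter=<date>".
    openssl x509 -in "$cert" -noout -enddate
    # -checkend exits non-zero if the certificate expires within N seconds.
    if openssl x509 -in "$cert" -noout -checkend "$((WARN_DAYS * 86400))" >/dev/null; then
        echo "ok: $cert"
    else
        echo "expiring within $WARN_DAYS days: $cert"
        return 1
    fi
}

status=0
for component in kube-controller-manager kube-scheduler; do
    check_cert "$TLS_DIR/$component/$component.crt" || status=1
done

if [ "$status" -ne 0 ]; then
    echo "at least one certificate needs attention"
fi
```

Run periodically (for example from a scheduled job), a check like this gives early warning before the certificates expire.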
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.