How to replace a control plane node in a Rancher local RKE2 cluster
Article Number: 000022214
Environment
- Rancher v2.5+ running on an RKE2 cluster
Procedure
01. First, prepare a node with a supported operating system (OS), as described in the documentation. As a best practice, the OS should be aligned (the same distribution and version) with the other cluster nodes and comply with the SUSE Rancher Support Matrix.
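A quick way to confirm the OS alignment is to compare the distribution details on the new node against an existing cluster node; this is a minimal check only, and the Support Matrix remains authoritative:

```
# Show the distribution name and version; run this on the new node
# and on an existing cluster node, then compare the output
cat /etc/os-release
```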
02. Take an etcd backup of the Rancher local cluster using the steps mentioned here.
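For illustration, assuming direct access to a control plane node, an on-demand etcd snapshot can be taken with the RKE2 CLI; the snapshot name below is an example, and the linked steps remain the authoritative procedure:

```
# Take an on-demand etcd snapshot (the name is an example);
# by default, snapshots are written to /var/lib/rancher/rke2/server/db/snapshots/
rke2 etcd-snapshot save --name pre-node-replacement
```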
03. You can additionally take a backup of Rancher state using the Rancher Backup Operator.
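If the rancher-backup chart is installed, a one-time backup can be requested by applying a Backup custom resource similar to the sketch below; the resource name is an assumption, while `rancher-resource-set` is the ResourceSet the chart creates by default:

```
# Hypothetical example: request a one-time backup of Rancher state
kubectl apply -f - <<EOF
apiVersion: resources.cattle.io/v1
kind: Backup
metadata:
  name: pre-node-replacement   # example name
spec:
  resourceSetName: rancher-resource-set
EOF
```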
04. Follow the steps to register the new node as an additional control plane node in the local cluster.
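As a rough sketch of what registration can look like for a standalone RKE2 cluster joined via a config file (the server address and token below are placeholders; if the local cluster nodes are provisioned another way, use the registration method from the documentation instead):

```
# On the new node: install RKE2 (the server type is the default)
curl -sfL https://get.rke2.io | sh -

# Point RKE2 at an existing server node before the first start;
# the token can be read from /var/lib/rancher/rke2/server/node-token
# on an existing server node
mkdir -p /etc/rancher/rke2
cat <<EOF > /etc/rancher/rke2/config.yaml
server: https://<EXISTING_SERVER_IP>:9345
token: <CLUSTER_JOIN_TOKEN>
EOF

# Start the service so the node joins as an additional control plane member
systemctl enable rke2-server --now
```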
05. Verify that etcd also shows the new node as a member by running the commands below on the new node:

```
export CRI_CONFIG_FILE=/var/lib/rancher/rke2/agent/etc/crictl.yaml
etcdcontainer=$(/var/lib/rancher/rke2/bin/crictl ps --label io.kubernetes.container.name=etcd --quiet)
/var/lib/rancher/rke2/bin/crictl exec $etcdcontainer etcdctl --cert /var/lib/rancher/rke2/server/tls/etcd/server-client.crt --key /var/lib/rancher/rke2/server/tls/etcd/server-client.key --cacert /var/lib/rancher/rke2/server/tls/etcd/server-ca.crt endpoint health --cluster --write-out=table
```

06. Run the command below on any existing node to ensure the node status is Ready:

```
kubectl get nodes
```

07. Now, open an SSH session to the node to be removed. Stop and disable the rke2-server service:

```
systemctl stop rke2-server
systemctl disable rke2-server
```

08. In the same SSH session, on the node to be removed, run the rke2-killall.sh cleanup script to terminate all Pod and RKE2 related processes:

```
rke2-killall.sh
```

09. Delete the node from the Kubernetes cluster by running the command below on any other control plane node in the cluster:

```
kubectl delete node <NODENAME>
```

10. Verify that the node is deleted:

```
kubectl get nodes
```
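For reference, the endpoint health command from step 05 prints a table similar to the illustrative sample below on a healthy three-member cluster (the endpoints and timings here are made up):

```
+------------------------+--------+------------+-------+
|        ENDPOINT        | HEALTH |    TOOK    | ERROR |
+------------------------+--------+------------+-------+
| https://10.0.0.11:2379 |   true | 11.02359ms |       |
| https://10.0.0.12:2379 |   true | 12.74501ms |       |
| https://10.0.0.13:2379 |   true | 13.89320ms |       |
+------------------------+--------+------------+-------+
```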