How to replace a control plane node in a Rancher local RKE2 cluster

Article Number: 000022214

Environment

  • Rancher v2.5+ running on an RKE2 cluster

Procedure

  1. Prepare a new node with a supported operating system (OS), as described in the documentation. As a best practice, use the same OS distribution and version as the other cluster nodes, and confirm that it complies with the SUSE Rancher Support Matrix.
  2. Take an etcd backup of the Rancher local cluster using the steps mentioned here.
  3. Optionally, also back up the Rancher state using the Rancher Backup Operator.
  4. Follow the steps to register the new node as an additional control plane node in the local cluster.
  5. Verify that etcd includes the new node as a healthy member by running the commands below on the new node. The output should contain one endpoint row per etcd member, including the new node:

    export CRI_CONFIG_FILE=/var/lib/rancher/rke2/agent/etc/crictl.yaml
    etcdcontainer=$(/var/lib/rancher/rke2/bin/crictl ps --label io.kubernetes.container.name=etcd --quiet)
    /var/lib/rancher/rke2/bin/crictl exec $etcdcontainer etcdctl --cert /var/lib/rancher/rke2/server/tls/etcd/server-client.crt --key /var/lib/rancher/rke2/server/tls/etcd/server-client.key --cacert /var/lib/rancher/rke2/server/tls/etcd/server-ca.crt endpoint health --cluster --write-out=table
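
    The health table from the command above can also be checked non-interactively. The following is a minimal sketch, not part of RKE2 or etcd: endpoint_healthy is a hypothetical helper name, it assumes the usual etcdctl table layout (ENDPOINT in the first column, HEALTH in the second), and it only handles IPv4 endpoints.

```shell
# Sketch only: endpoint_healthy is a hypothetical helper, not an RKE2 or etcd tool.
# It reads the output of `etcdctl endpoint health --write-out=table` on stdin and
# succeeds only if a row for the given IPv4 address reports HEALTH "true".
endpoint_healthy() {
  awk -F'|' -v ip="$1" '
    # $2 is the ENDPOINT column, $3 is the HEALTH column.
    index($2, "//" ip ":") && $3 ~ /true/ { ok = 1 }
    END { exit !ok }
  '
}
```

    For example, appending `| endpoint_healthy <NEW_NODE_IP>` to the crictl exec command above would return exit status 0 only when the new node's endpoint is listed as healthy; <NEW_NODE_IP> is a placeholder for the new node's address.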
    
  6. Run the command below on any existing control plane node and confirm that the new node's status is Ready:

    kubectl get nodes
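
    If the readiness check needs to be scripted, the node list can be filtered for a single node. This is a minimal sketch under stated assumptions: node_is_ready is a hypothetical helper name, and it expects the default column layout of `kubectl get nodes --no-headers` (NAME first, STATUS second).

```shell
# Sketch only: node_is_ready is a hypothetical helper, not a kubectl feature.
# It reads `kubectl get nodes --no-headers` output on stdin and succeeds only
# if the named node's STATUS starts with "Ready" (for example "Ready" or
# "Ready,SchedulingDisabled"); "NotReady" does not match.
node_is_ready() {
  awk -v node="$1" '
    $1 == node { split($2, status, ","); if (status[1] == "Ready") ok = 1 }
    END { exit !ok }
  '
}
```

    A possible invocation is `kubectl get nodes --no-headers | node_is_ready <NODENAME>`. As an alternative that blocks until the condition is met, `kubectl wait --for=condition=Ready node/<NODENAME> --timeout=300s` can be used; the timeout value here is only an example.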
    
  7. Open an SSH session to the node to be removed, then stop and disable the rke2-server service:

    systemctl stop rke2-server
    systemctl disable rke2-server
    
  8. In the same SSH session on the node to be removed, run the rke2-killall.sh cleanup script to terminate all Pod and RKE2-related processes:

    rke2-killall.sh
    
  9. Delete the node from the Kubernetes cluster by running the command below on any other control plane node in the cluster:

    kubectl delete node <NODENAME>
    
  10. Verify that the node is deleted and no longer appears in the node list:

    kubectl get nodes
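
The final verification can be scripted in the same way. The sketch below assumes the `node/<name>` line format printed by `kubectl get nodes -o name`; node_absent is a hypothetical helper name used only for illustration.

```shell
# Sketch only: node_absent is a hypothetical helper, not a kubectl feature.
# It reads `kubectl get nodes -o name` output on stdin (one "node/<name>" per
# line) and succeeds only if the given node name is NOT in the list.
node_absent() {
  # -x matches whole lines only, -F treats the name as a literal string.
  ! grep -qxF "node/$1"
}
```

A possible invocation is `kubectl get nodes -o name | node_absent <NODENAME>`, which returns exit status 0 once the removed node is gone from the cluster.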