How to re-add a master node to an RKE2 HA cluster after it has been removed from the cluster
This document (000020821) is provided subject to the disclaimer at the end of this document.
Environment
RKE2
Situation
Re-adding one of the master nodes to the cluster fails after node maintenance or an OS repair.
Resolution
Once the node is ready to rejoin the cluster after the repair, perform the following steps on that node.
1. Collect the token from an existing master node and update /etc/rancher/rke2/config.yaml on the node being re-added (see Additional Information).
2. Make sure all RKE2 processes are cleaned up on the removed node.
cd /usr/local/bin
./rke2-killall.sh
3. Remove the node's old local database (etcd data) directory.
sudo rm -rf /var/lib/rancher/rke2/server/db
4. Start the RKE2 server service.
sudo systemctl start rke2-server
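The commands from the steps above can be collected into a small script. This is only a sketch: the `run` and `rejoin_steps` helpers and the `DRY_RUN` variable are hypothetical names introduced here, not part of RKE2. By default the script only prints the commands so they can be reviewed first. It assumes config.yaml has already been updated and that rke2-killall.sh lives in /usr/local/bin, as in the steps above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print the command when DRY_RUN=1 (the default here),
# execute it when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical wrapper around the cleanup and restart commands from the article.
rejoin_steps() {
  # Stop any leftover RKE2 processes on the node.
  run /usr/local/bin/rke2-killall.sh
  # Remove the old local database directory.
  run sudo rm -rf /var/lib/rancher/rke2/server/db
  # Start the RKE2 server service again.
  run sudo systemctl start rke2-server
}

rejoin_steps
```

Running the script as-is lists the three commands. Setting DRY_RUN=0 in the environment executes them, so review the printed output before doing that on a real node.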
Cause
Because of OS corruption, one of the master nodes was removed from a running RKE2 cluster.
Status
Top Issue
Additional Information
This fix can be applied to an RKE2 cluster that was installed with the binary install method. It has not been verified on Rancher-provisioned RKE2 clusters.
The token can be retrieved from a running master node if it is not a pre-shared token.
cat /var/lib/rancher/rke2/server/node-token
A basic config.yaml looks like this:
cat /etc/rancher/rke2/config.yaml
server: https://xxxxxxxsx:9345
token: xxxxxxxxxx7e8068c9fe03ec::server:6cc0ffdd5a127be53031efea454xxxx
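As an illustration, a file with these two keys could be generated as follows. The `write_rke2_config` function and the placeholder values are hypothetical. The sketch writes to a scratch file rather than /etc/rancher/rke2/config.yaml, and it assumes no other settings are needed in the file.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: write a minimal config with the two keys shown above.
# Arguments: server URL, token, output file.
write_rke2_config() {
  local server_url="$1" token="$2" out_file="$3"
  # The token is a secret, so restrict permissions on any newly created file.
  ( umask 077; printf 'server: %s\ntoken: %s\n' "$server_url" "$token" > "$out_file" )
}

# Example with placeholder values and a scratch path (assumptions, not real values).
tmp_config="$(mktemp)"
write_rke2_config "https://server.example:9345" "REPLACE_WITH_NODE_TOKEN" "$tmp_config"
cat "$tmp_config"
```

On a real node the server URL would point at an existing master node on port 9345, and the token would be the value read from node-token on that node.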
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.