Failed etcd snapshot restoration leaves the cluster stuck in a "paused" state

This document (000021399) is provided subject to the disclaimer at the end of this document.

Environment

Rancher Server 2.7.6 and above

Situation

In some cases, a downstream cluster can get into a broken state that requires a Disaster Recovery (DR) process to bring it back to an active state.

Sometimes the DR process does not finish properly and hangs indefinitely, which leaves the cluster in what is called a "paused" state.

This symptom can be seen by checking the clusters.cluster.x-k8s.io object in the fleet-default namespace from the local (upstream) cluster.

kubectl get clusters.cluster.x-k8s.io <CLUSTER_NAME> -n fleet-default -o yaml

In the YAML output, the .spec.paused field is set to true.
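As a quicker check than reading the full YAML, the field can be queried directly with a jsonpath expression. The following is a minimal sketch, not part of the original procedure: the helper name is_cluster_paused is hypothetical, and it assumes kubectl is pointed at the local (upstream) cluster.

```shell
# Hypothetical helper: succeeds (exit 0) only when .spec.paused is "true".
# Usage: is_cluster_paused <CLUSTER_NAME>
is_cluster_paused() {
  cluster_name="$1"
  # Query only the .spec.paused field of the cluster object in fleet-default.
  paused="$(kubectl get clusters.cluster.x-k8s.io "$cluster_name" \
    -n fleet-default -o jsonpath='{.spec.paused}')"
  # An unset field (or a failed lookup) prints nothing and is treated as "not paused".
  [ "$paused" = "true" ]
}
```

For example, `is_cluster_paused <CLUSTER_NAME> && echo "cluster is paused"` prints the message only when the field is set to true.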

Resolution

To unblock the cluster, perform the following steps:

- Edit the clusters.cluster.x-k8s.io object in the fleet-default namespace from the local (upstream) cluster

kubectl edit clusters.cluster.x-k8s.io <CLUSTER_NAME> -n fleet-default -o yaml

- Set the .spec.paused field to false

- Save the file and exit

These steps instruct Rancher to unpause the cluster, which unblocks the stuck restore process so that it can continue.
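The same change can also be applied without opening an editor, which may be convenient in scripts. This is a hedged sketch rather than the documented procedure: the helper name unpause_cluster is hypothetical, and it assumes a JSON merge patch on the same object is equivalent to the manual edit described above.

```shell
# Hypothetical helper: sets .spec.paused to false with a JSON merge patch.
# Usage: unpause_cluster <CLUSTER_NAME>   (run against the local/upstream cluster)
unpause_cluster() {
  cluster_name="$1"
  # --type merge is used because custom resources do not support strategic merge patches.
  kubectl patch clusters.cluster.x-k8s.io "$cluster_name" \
    -n fleet-default --type merge -p '{"spec":{"paused":false}}'
}
```

After running `unpause_cluster <CLUSTER_NAME>`, the kubectl get command from the Situation section can be used to confirm that .spec.paused is now false.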

The recommended approach is to run the DR process again after the edit is made.

Afterwards, refer to the Rancher Manager backup and restore documentation to continue the DR process for the distribution in use (RKE, RKE2, or K3s).

Cause

An unforeseen incident (for example, a network or OS failure) led the cluster into a broken state.

An outage made all control plane nodes completely unavailable.

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.