
Cluster agent is not connected after accidentally deleting the cattle-system Namespace of a downstream cluster

Article Number: 000021478

Environment

Rancher 2.7.x, 2.8.x

Situation

After the cattle-system Namespace of a downstream cluster is accidentally deleted, the cluster is no longer accessible in the Rancher UI because the cluster agent was removed along with the Namespace. To recover the cluster, the cluster agent must be manually recreated and the cluster service account token updated.

Requirements

  • Rancher Management

      • Kubectl CLI and kubeconfig file.

  • Downstream cluster

      • SSH access to a control plane node.

      • Kubectl CLI and kubeconfig file.

Resolution

RKE2 custom and node-driver clusters

Redeploy the Rancher agents

Steps

1.1 Connect to the affected downstream cluster, then back up and delete the Rancher webhook configurations.

kubectl get mutatingwebhookconfigurations rancher.cattle.io -o yaml > backup-mutatingwebhookconfigurations.yaml
kubectl get validatingwebhookconfigurations rancher.cattle.io -o yaml > backup-validatingwebhookconfigurations.yaml
kubectl delete mutatingwebhookconfigurations rancher.cattle.io
kubectl delete validatingwebhookconfigurations rancher.cattle.io

Note: these objects will be recreated once the cluster is connected again.
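Because the delete commands run immediately after the backups, it may be worth confirming that each backup file was actually written before removing the object. The following is a minimal sketch of that idea, not part of the official procedure; the function name backup_and_delete is hypothetical, and the file naming simply mirrors the commands above.

```shell
# Sketch: back up a webhook configuration and delete it only if the backup
# file exists and is non-empty. backup_and_delete is a hypothetical helper.
backup_and_delete() {
    kind="$1"                      # e.g. mutatingwebhookconfigurations
    name="$2"                      # e.g. rancher.cattle.io
    backup="backup-${kind}.yaml"   # same naming as the commands above

    # Export the object; stop if the export itself fails.
    kubectl get "$kind" "$name" -o yaml > "$backup" || return 1

    # Refuse to delete anything when the backup file is empty.
    [ -s "$backup" ] || { echo "empty backup: $backup" >&2; return 1; }

    kubectl delete "$kind" "$name"
}
```

With the downstream kubeconfig active, it could be called once per object, for example backup_and_delete mutatingwebhookconfigurations rancher.cattle.io, followed by the same call for validatingwebhookconfigurations.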

1.2 Manually redeploy the agent

Follow the steps described in this section to redeploy the Rancher agents:

The cattle-system namespace and the cluster agent will be recreated.
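The recreation can also be confirmed from the command line on the downstream cluster. The sketch below assumes the agent runs as the Deployment cattle-cluster-agent in the cattle-system namespace; the function name wait_for_cluster_agent and the default timeout are illustrative choices, not values from this article.

```shell
# Sketch: confirm the namespace exists again and wait for the agent rollout.
# wait_for_cluster_agent is a hypothetical helper; 300s is an assumed default.
wait_for_cluster_agent() {
    timeout="${1:-300s}"

    # The namespace must exist before the Deployment can be checked.
    kubectl get namespace cattle-system || return 1

    # Block until the rollout completes or the timeout expires.
    kubectl -n cattle-system rollout status deployment/cattle-cluster-agent --timeout="$timeout"
}
```

Calling wait_for_cluster_agent 120s returns a non-zero status if the namespace is still missing or the agent does not become ready within two minutes.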

1.3 Force a cluster reconciliation

Apply a minor change to the cluster configuration, such as changing the etcd snapshot retention.

  1. Click ☰ > Cluster Management.
  2. Go to the cluster you want to configure and click ⋮ > Edit Config.
  3. Under Cluster Configuration > etcd, increase the number of snapshots kept per node, then click Save.
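The same minor change can probably be made without the UI by patching the provisioning cluster object on the Rancher management (local) cluster. This is a sketch under several assumptions that are not stated in this article: the cluster object lives in the fleet-default namespace, the retention is stored in spec.rkeConfig.etcd.snapshotRetention, and an unset value is treated as 5. The function name bump_snapshot_retention is hypothetical.

```shell
# Sketch: raise the etcd snapshot retention by one to trigger a reconciliation.
# Run against the Rancher management cluster, not the downstream cluster.
bump_snapshot_retention() {
    cluster="$1"                 # name of the downstream cluster object
    ns="${2:-fleet-default}"     # assumed namespace of the cluster object

    # Read the current retention value from the cluster spec.
    current="$(kubectl -n "$ns" get clusters.provisioning.cattle.io "$cluster" \
        -o jsonpath='{.spec.rkeConfig.etcd.snapshotRetention}')" || return 1

    # Assumption: treat an unset field as 5 snapshots.
    current="${current:-5}"
    next=$((current + 1))

    # Merge-patch only the retention field.
    kubectl -n "$ns" patch clusters.provisioning.cattle.io "$cluster" --type merge \
        -p "{\"spec\":{\"rkeConfig\":{\"etcd\":{\"snapshotRetention\":${next}}}}}"
}
```

For a cluster named my-cluster, bump_snapshot_retention my-cluster would change a retention of 5 to 6. The value can be set back once the cluster reports as active again.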