How to perform graceful Node Shutdown in RKE2

Article Number: 000022104

Environment

  • RKE2

Procedure

To perform a graceful node shutdown in RKE2 for maintenance scenarios, follow these steps to cordon and drain the node:

  • Mark the node as unschedulable using the command:

kubectl cordon <node name>
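
  • Drain the node to evict its pods. The drain goes through the eviction API, so Pod Disruption Budgets are honored and can delay it; --ignore-daemonsets skips DaemonSet-managed pods, and --force also removes pods that are not managed by a controller: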

kubectl drain <node name> --ignore-daemonsets --force
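
The two commands above can be wrapped in a small helper so they always run in order. This is a minimal sketch, not part of RKE2: the function name cordon_and_drain and the KUBECTL and DRAIN_TIMEOUT variables are hypothetical, and the 300s timeout is an assumed value to adjust for your workloads.

```shell
#!/usr/bin/env bash
# Sketch: cordon and drain one node before maintenance.
# KUBECTL and DRAIN_TIMEOUT are hypothetical overrides, not RKE2 settings.
KUBECTL="${KUBECTL:-kubectl}"
DRAIN_TIMEOUT="${DRAIN_TIMEOUT:-300s}"

cordon_and_drain() {
  local node="$1"
  if [ -z "$node" ]; then
    echo "usage: cordon_and_drain <node name>" >&2
    return 1
  fi
  # Mark the node unschedulable first so nothing new lands on it.
  "$KUBECTL" cordon "$node" || return 1
  # Evict the pods; --timeout makes the drain give up instead of waiting forever.
  "$KUBECTL" drain "$node" --ignore-daemonsets --force --timeout="$DRAIN_TIMEOUT"
}
```

Usage: source the file and run cordon_and_drain <node name>. If the drain times out, a Pod Disruption Budget may be blocking an eviction and should be checked before continuing.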

On worker nodes

1. Stop the rke2-agent service: 

sudo systemctl stop rke2-agent

2. Check for any remaining container processes that should be stopped:

sudo ps auxfww
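
The full process tree can be long, so it may help to narrow the listing to container-related processes. The sketch below assumes the leftover processes are named containerd (including containerd-shim variants) or kubelet; the pattern, variable, and function names are illustrative and not defined by RKE2.

```shell
#!/usr/bin/env bash
# Sketch: stop the agent, then show only container-related leftovers.
# LEFTOVER_PATTERN is an assumption about process names. The bracketed
# first letters stop grep from matching its own command line in the ps output.
LEFTOVER_PATTERN='[c]ontainerd|[k]ubelet'

filter_leftovers() {
  # Reads a process listing on stdin and prints the matching lines.
  # "|| true" keeps the exit status at 0 when nothing matches.
  grep -E "$LEFTOVER_PATTERN" || true
}

stop_agent_and_check() {
  sudo systemctl stop rke2-agent || return 1
  ps auxfww | filter_leftovers
}
```

Empty output from stop_agent_and_check suggests that no matching processes remain on the node.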

On control plane / etcd nodes

1. Stop the rke2-server service: 

sudo systemctl stop rke2-server 

2. Check for any remaining container processes that should be stopped: 

sudo ps auxfww
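
Because the only difference between the two node types is the service name, a maintenance script can select the unit from the node role. This is a rough sketch; the function names and the accepted role strings are hypothetical choices, not RKE2 terminology beyond the rke2-agent and rke2-server unit names used above.

```shell
#!/usr/bin/env bash
# Sketch: map a node role to the RKE2 systemd unit and stop it.
rke2_unit_for_role() {
  case "$1" in
    agent|worker)              echo rke2-agent ;;
    server|control-plane|etcd) echo rke2-server ;;
    *) echo "unknown role: $1" >&2; return 1 ;;
  esac
}

stop_rke2() {
  local unit
  # Abort without touching systemd if the role is not recognized.
  unit="$(rke2_unit_for_role "$1")" || return 1
  sudo systemctl stop "$unit"
}
```

For example, stop_rke2 worker stops rke2-agent, while stop_rke2 etcd stops rke2-server.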

Stop remaining processes

If all application workloads have already been stopped, this step is not required before shutting down the node. In some cases, however, it may be useful to stop all remaining container processes and components such as containerd.

  • Verify that no application workloads are running on the node:

kubectl describe node <node name>

  • If all application workloads have been rescheduled onto other nodes, the leftover container processes and all related RKE2 processes can be stopped using the rke2-killall.sh script:

sudo /usr/local/bin/rke2-killall.sh
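
As a complement to kubectl describe node, the pods still bound to the node can be listed directly before running the killall script. The sketch below is one possible check; the function name pods_on_node and the KUBECTL variable are hypothetical. After a drain with --ignore-daemonsets, DaemonSet-managed pods are still expected in the output.

```shell
#!/usr/bin/env bash
# Sketch: list every pod still assigned to a node, across all namespaces.
# KUBECTL is a hypothetical override for the kubectl binary.
KUBECTL="${KUBECTL:-kubectl}"

pods_on_node() {
  # spec.nodeName is a supported field selector for pods.
  "$KUBECTL" get pods --all-namespaces --field-selector "spec.nodeName=$1" -o wide
}
```

Run pods_on_node <node name> from a machine with cluster access and confirm that only DaemonSet pods remain before stopping the remaining processes.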

Note: The rke2-killall.sh script uses SIGKILL to terminate processes, which may negatively impact stateful application workloads that are still running. For stateful workloads, consider a solution that sends SIGTERM with a timeout before resorting to SIGKILL. For related information, refer to the documentation: Best practices for RKE2 cluster maintenance. Always capture an etcd snapshot before performing any node maintenance activity.
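
The SIGTERM-then-SIGKILL approach mentioned in the note could look roughly like the following for a single process. This is an illustrative sketch, not what rke2-killall.sh does: the function name term_then_kill and the 30-second default are assumptions, and stopping processes owned by other users requires running it as root.

```shell
#!/usr/bin/env bash
# Sketch: ask a process to exit with SIGTERM, escalate to SIGKILL after a timeout.
# Arguments: <pid> [timeout in seconds, default 30 (assumed value)]
term_then_kill() {
  local pid="$1" timeout="${2:-30}" waited=0
  # If SIGTERM cannot be delivered, the process is already gone.
  kill -TERM "$pid" 2>/dev/null || return 0
  # kill -0 only tests whether the process still exists.
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      kill -KILL "$pid" 2>/dev/null
      return 0
    fi
    sleep 1
    waited=$((waited + 1))
  done
}
```

For example, term_then_kill 12345 60 gives the process with PID 12345 up to 60 seconds to shut down cleanly before it is killed.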

Start the service again

  • After maintenance, start the service:
    • Worker (agent) nodes: sudo systemctl start rke2-agent
    • Control plane (server) nodes: sudo systemctl start rke2-server
  • Mark the node as schedulable again:

kubectl uncordon <node name>
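
After the service starts, the node can take a short while to report Ready, so it can be useful to wait for that condition before uncordoning. The following is a minimal sketch; the function name uncordon_when_ready and the KUBECTL and READY_TIMEOUT variables are hypothetical, and 300s is an assumed timeout.

```shell
#!/usr/bin/env bash
# Sketch: wait for the node to become Ready, then make it schedulable again.
# KUBECTL and READY_TIMEOUT are hypothetical overrides, not RKE2 settings.
KUBECTL="${KUBECTL:-kubectl}"
READY_TIMEOUT="${READY_TIMEOUT:-300s}"

uncordon_when_ready() {
  local node="$1"
  # Leave the node cordoned if it does not become Ready in time.
  "$KUBECTL" wait --for=condition=Ready "node/$node" --timeout="$READY_TIMEOUT" || return 1
  "$KUBECTL" uncordon "$node"
}
```

If the wait fails, the node stays cordoned, which leaves time to inspect the rke2-agent or rke2-server service logs before scheduling workloads onto it.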