Difference between draining for Kubernetes version/config updates and Node Pool config updates

Article Number: 000022059

Environment

A Rancher-provisioned RKE2 or K3s node driver cluster

Situation

During a cluster maintenance operation that involves both a Kubernetes version or configuration update and a change to the node pool configuration (e.g., updating the OS image), nodes are observed to get stuck in a "Deleting" state.

The symptoms include:

  • Pods running on these nodes are not automatically evicted.
  • The upgradeStrategy drain options defined in the cluster's specification appear to have no effect on this behaviour.

This raises the question of why the drain process configured in the upgradeStrategy is not applied.

Cause

It is important to understand that node replacement within a Node Pool (triggered by an update to the node pool configuration) and an in-place upgrade of the Kubernetes version or configuration are two separate processes, managed by separate controllers within Rancher.

  • Node Pool update draining: This process is controlled by the Kubernetes Cluster API (CAPI). It is triggered when the configuration template for a node pool is changed. This process replaces old nodes with new ones.

  • In this case, the draining behaviour is governed by the drainBeforeDelete: true flag in the machinePools specification. If this flag is false or absent, CAPI will not drain the node before deleting it, leading to the stuck pods and the "Deleting" state.

  • Kubernetes version/configuration update draining: This process is controlled by Rancher's upgrade controller. It reads the upgradeStrategy section in the cluster's specification. Its purpose is to manage the draining of nodes for an in-place update (e.g., upgrading the RKE2 version). The node itself is not replaced, so the underlying machine persists. It is simply cordoned, drained, updated, and uncordoned.

The problem described in the Situation section occurs because the node replacement is triggered by the node pool update, but the drainBeforeDelete: true flag is missing from the machine pool specification (see the sketch below).
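
For illustration, here is a minimal sketch of where this flag sits in the provisioning cluster object. The cluster name, pool name, and machine config reference below are placeholders; the relevant field is drainBeforeDelete:

    apiVersion: provisioning.cattle.io/v1
    kind: Cluster
    metadata:
      name: my-cluster                # placeholder name
      namespace: fleet-default
    spec:
      rkeConfig:
        machinePools:
          - name: worker-pool         # placeholder pool name
            quantity: 3
            workerRole: true
            drainBeforeDelete: true   # if false or absent, CAPI deletes the machine without draining it
            machineConfigRef:         # placeholder node driver config reference
              kind: VmwarevsphereConfig
              name: nc-worker-pool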

Resolution

To ensure nodes are drained correctly during all maintenance operations, you must configure the appropriate mechanism for each of the two processes, the Node Pool update and the Kubernetes version/configuration update:

  • Node Pool updates: In your cluster's provisioning resource (cluster.provisioning.cattle.io), ensure every machine pool has the drainBeforeDelete flag set to true. This option is exposed under the Show Advanced section of the Machine Pools configuration, in the Edit Config view for a cluster. While this option is enabled by default when configuring a cluster via the Rancher UI, it is important to specify it explicitly when the cluster is managed by external automation such as GitOps tooling.
  • Kubernetes version/configuration updates: Configure your desired drain behaviour within the upgradeStrategy section of the cluster spec. This is respected during an in-place Kubernetes version or configuration update. This option is exposed in the Update Strategy tab of the Cluster Configuration section, in the Edit Config view for a cluster. A combined sketch of both settings follows this list.
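
As a reference, the relevant parts of the cluster spec might look like the sketch below. The version, pool name, and drain option values are illustrative; choose the drain options, concurrency, and timeouts that suit your workloads. Note that drainBeforeDelete is set per machine pool, while upgradeStrategy applies cluster-wide:

    spec:
      kubernetesVersion: v1.31.7+rke2r1        # illustrative target version for an in-place update
      rkeConfig:
        machinePools:
          - name: worker-pool                  # placeholder pool name
            quantity: 3
            workerRole: true
            drainBeforeDelete: true            # drains nodes before replacement (node pool updates)
        upgradeStrategy:                       # drain behaviour for in-place version/config updates
          controlPlaneConcurrency: "1"
          workerConcurrency: "1"
          controlPlaneDrainOptions:
            enabled: true
            ignoreDaemonSets: true
          workerDrainOptions:
            enabled: true
            ignoreDaemonSets: true
            deleteEmptyDirData: true
            gracePeriod: 60                    # seconds
            timeout: 120                       # seconds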

Most importantly, you should not perform a Kubernetes version/configuration update and a node pool update at the same time. Trigger these maintenance tasks in separate steps. For example:

  • First, apply the changes to your node pool configuration and wait for all nodes to be replaced successfully.
  • Then, apply the Kubernetes version/configuration update.

Applying a Kubernetes version/configuration update and a node pool template change at the same time will trigger two parallel, competing processes. This is inefficient (a node's Kubernetes version might be upgraded in-place only for the node to be immediately deleted due to the node pool configuration update) and makes it extremely difficult to troubleshoot if an issue occurs. Always perform these operations separately.