kube-proxy not upgraded on some nodes during RKE2 version upgrade in RKE2 <1.27.12, <1.28.8 and <1.29.4

Article Number: 000021284

Environment

An RKE2 cluster on a version below 1.27.12, 1.28.8 or 1.29.4

Situation

A bug affecting RKE2 clusters on versions below 1.27.12, 1.28.8 and 1.29.4 can prevent the kube-proxy containers on some nodes from being upgraded during a cluster upgrade, leaving them on the pre-upgrade version. Errors of the following format are observed in the kubelet logs:

"Unable to attach or mount volumes for pod; skipping pod" err="unmounted volumes=[file0 file1 file2 file3], unattached volumes=[file0 file1 file2 file3]: timed out waiting for the condition" pod="kube-system/kube-proxy-test-00865632-02"

Cause

This bug was tracked in https://github.com/rancher/rke2/issues/4864 and fixed in RKE2 1.27.12, 1.28.8, 1.29.4 and above.

Resolution

The issue can be resolved by upgrading the affected RKE2 cluster to 1.27.12, 1.28.8, 1.29.4 or above.
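As a rough aid when deciding whether a cluster needs this upgrade, the affected-version check can be sketched in shell. is_affected is a hypothetical helper, not part of RKE2:

```shell
# Hypothetical helper: succeed if a Kubernetes/RKE2 patch version
# (e.g. "1.28.7") predates the fix for this bug on its minor line.
is_affected() {
  IFS=. read -r _major minor patch <<EOF
$1
EOF
  case "$minor" in
    27) [ "$patch" -lt 12 ] ;;  # fixed in 1.27.12
    28) [ "$patch" -lt 8 ]  ;;  # fixed in 1.28.8
    29) [ "$patch" -lt 4 ]  ;;  # fixed in 1.29.4
    *)  return 1 ;;             # other minor lines are not listed as affected
  esac
}
```

For example, is_affected 1.28.7 succeeds while is_affected 1.28.8 fails.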

A workaround is also available for affected nodes:

  1. Open an SSH shell on the affected node
  2. Move the kube-proxy.yaml manifest out of the static pod folder. You might need to change the path if you are using a non-default static pod folder:

     mv /var/lib/rancher/rke2/agent/pod-manifests/kube-proxy.yaml /var/lib/rancher/rke2/agent/kube-proxy.yaml_backup

  3. Restart the rke2-agent service:

     systemctl restart rke2-agent
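The workaround steps above can be wrapped in a small script. apply_kube_proxy_workaround is a hypothetical helper name, and the default path assumes the standard RKE2 static pod folder:

```shell
# Hypothetical wrapper around the workaround steps; run as root on the affected node.
apply_kube_proxy_workaround() {
  # Default RKE2 static pod folder; pass a custom folder as $1 if yours differs.
  dir="${1:-/var/lib/rancher/rke2/agent/pod-manifests}"
  # Move the manifest out of the static pod folder, as in step 2 above.
  mv "$dir/kube-proxy.yaml" "${dir%/*}/kube-proxy.yaml_backup"
  # Restart the agent, as in step 3 above.
  systemctl restart rke2-agent
}
```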