Jobs apply-system-agent-upgrader on nodes keep failing
This document (000021976) is provided subject to the disclaimer at the end of this document.
Environment
Rancher-provisioned RKE2 downstream clusters
Situation
The apply-system-agent-upgrader jobs fail on every node in newly Rancher-provisioned RKE2 downstream clusters, with the following error:
+ CATTLE_AGENT_VAR_DIR=/var/lib/rancher/agent
+ TMPDIRBASE=/var/lib/rancher/agent/tmp
+ mkdir -p /host/var/lib/rancher/agent/tmp
mkdir: cannot create directory '/host/var/lib/rancher': File exists
The error shows that mkdir cannot create the directory /host/var/lib/rancher because an entry with that name already exists and cannot be used as a directory. This occurs because /var/lib/rancher is a symlink on the host.
Resolution
Delete the downstream cluster and recreate it on nodes where /var/lib/rancher is not a symlink. Alternatively, remove the /var/lib/rancher symlink from each node before creating the cluster.
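Before creating the cluster, each node can be checked for the symlink. The following is a minimal sketch, not an official SUSE tool: the function name check_rancher_dir and its output words are hypothetical and chosen only for illustration. It reports whether a path is a symlink, a real directory, something else, or absent.

```shell
#!/bin/sh
# Hypothetical helper: classify a path so a symlinked /var/lib/rancher
# can be spotted before the cluster is provisioned.
check_rancher_dir() {
  path=$1
  if [ -L "$path" ]; then
    # Test for a symlink first: -d would follow the link and hide it.
    echo "symlink"
    return 1
  elif [ -d "$path" ]; then
    echo "directory"
  elif [ -e "$path" ]; then
    # Exists, but is neither a symlink nor a directory.
    echo "other"
    return 1
  else
    echo "absent"
  fi
}

# Check the path named in this article; print a hint if it needs attention.
check_rancher_dir /var/lib/rancher || echo "fix /var/lib/rancher before provisioning"
```

A result of "directory" or "absent" means the node does not have the symlink described here. If the result is "symlink", note that rm applied to a symlink removes only the link itself, not the data at its target.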
Cause
The issue is caused by a pre-existing /var/lib/rancher symlink on the host, which prevents the job's pod from creating the directory it needs beneath that path.
A symlink (symbolic link) is a pointer to another file or directory path. The pod's filesystem is separate from the host's, and the job reaches the host's files under /host, as the trace shows. A symlink created on the host points to a location on the host's filesystem. When the pod tries to follow that link, the target is looked up in the pod's own filesystem, where the path does not exist, so the link cannot be resolved. This separation is part of the isolation that Kubernetes and containers in general provide.
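The effect can be reproduced without a cluster. The sketch below is an illustration under stated assumptions: a scratch directory stands in for the /host mount, and the link target name is made up so that it does not resolve from this side, which mimics what the pod sees. The mkdir -p call is then expected to fail in the same way as in the job's trace.

```shell
#!/bin/sh
# Scratch area; "$scratch/host" stands in for the /host mount inside the pod.
scratch=$(mktemp -d)
hostroot="$scratch/host"
mkdir -p "$hostroot/var/lib"

# Absolute link whose target is not reachable from this view.
# The target name is hypothetical and exists only for this demonstration.
ln -s "/no-such-dir-on-this-side-$$" "$hostroot/var/lib/rancher"

# Same kind of call as in the job's trace. mkdir finds an existing entry
# (the link) that cannot be used as a directory, so it fails.
if mkdir -p "$hostroot/var/lib/rancher/agent/tmp" 2>"$scratch/err.txt"; then
  mkdir_status=0
else
  mkdir_status=$?
fi

echo "mkdir exit status: $mkdir_status"
cat "$scratch/err.txt"

# Clean up the scratch area.
rm -rf "$scratch"
```

If the link is replaced with a real directory, the same mkdir -p call succeeds, which matches the resolution of removing the symlink before the cluster is created.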
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.