Rancher v2.x provisioned vSphere cluster nodes stuck in provisioning, with "Waiting for SSH to be available", as a result of pre-existing cloud-init configuration in VM Template

Article Number: 000020042

Environment

A Rancher v2.x provisioned vSphere cluster, using the vSphere Node Driver.

Situation

Upon launching a vSphere Node Driver cluster in Rancher v2.x, nodes within the cluster are stuck in provisioning, with the message "Waiting for SSH to be available". Logging into the nodes via SSH and checking the auth log reveals failed SSH connection attempts for a missing docker user.
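The auth log check described above can be sketched as a small shell helper. This is a minimal sketch, not an official Rancher tool: the function name count_invalid_docker_logins is hypothetical, the default path /var/log/auth.log assumes a Debian or Ubuntu node, and the search string assumes OpenSSH's usual "Invalid user <name>" wording for unknown accounts.

```shell
#!/bin/sh
# Count sshd log entries that reject logins for the user "docker".
# Assumptions (not from the article): the log is at /var/log/auth.log
# (Debian/Ubuntu default) and OpenSSH records unknown accounts as
# "Invalid user <name>".
count_invalid_docker_logins() {
    log_file="${1:-/var/log/auth.log}"
    # grep -c prints the number of matching lines (0 if none match).
    grep -c 'Invalid user docker' "$log_file"
}
```

Run it on an affected node as a user with read access to the log, for example count_invalid_docker_logins /var/log/auth.log. A non-zero count is consistent with the docker user never having been created on the node.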

Cause

When provisioning a vSphere Node Driver cluster, Rancher v2.x uses cloud-init to generate an SSH key pair for the user docker and copy it into the Virtual Machine on initial boot.

In some Linux distributions, including Ubuntu Server 18.04, the standard OS installation process generates a cloud-init configuration. The OS is installed during the initial setup of the VM Template, prior to cluster provisioning via Rancher, and this existing cloud-init configuration within the Template can interfere with Rancher's ability to insert its own cloud-init configuration.

Resolution

Convert the Template back to a VM and run:

sudo cloud-init clean

This command cleans the VM of any existing cloud-init configuration. Once it completes, convert the VM back to a Template and retry provisioning.
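Before converting the VM back to a Template, it may help to confirm that the clean actually removed the earlier state. The following is a minimal sketch under stated assumptions: the function name cloud_init_state_is_clean is hypothetical, and it assumes cloud-init keeps its state under the default /var/lib/cloud directory, with leftover per-instance data appearing as entries in the instances subdirectory.

```shell
#!/bin/sh
# Report whether a cloud-init state directory still holds per-instance data.
# Assumptions (not from the article): state lives under /var/lib/cloud,
# cloud-init's default, and leftover data shows up as entries in its
# "instances" subdirectory.
cloud_init_state_is_clean() {
    state_dir="${1:-/var/lib/cloud}"
    # Clean if "instances" is missing or contains no entries.
    [ ! -d "$state_dir/instances" ] || [ -z "$(ls -A "$state_dir/instances")" ]
}

# Typical use inside the VM, after converting the Template back to a VM:
#   sudo cloud-init clean
#   cloud_init_state_is_clean && echo "clean" || echo "leftover state found"
```

The function only inspects the directory and changes nothing, so it can be run repeatedly. If it reports leftover state, rerun the clean command before converting the VM back to a Template.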