Rancher v2.x provisioned vSphere cluster nodes stuck in provisioning, with "Waiting for SSH to be available", as a result of pre-existing cloud-init configuration in VM Template
This document (000020042) is provided subject to the disclaimer at the end of this document.
Environment
A Rancher v2.x provisioned vSphere cluster, using the vSphere Node Driver.
Situation
Upon launching a vSphere Node Driver cluster in Rancher v2.x, nodes within the cluster are stuck in provisioning, with the message "Waiting for SSH to be available". Logging into the nodes via SSH and checking the auth log directly reveals failed SSH connection attempts for a missing docker user.
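As a rough sketch of how this symptom could be confirmed on an affected node, the helper below counts auth log entries where sshd rejected a login for the docker user. The function name is hypothetical, the default path /var/log/auth.log is an assumption that holds for Debian and Ubuntu (other distributions may log elsewhere, for example /var/log/secure), and the "Invalid user" wording is what OpenSSH typically logs for a non-existent account.

```shell
# Hypothetical helper: count sshd "Invalid user docker" entries in an auth log.
# The default path is an assumption (Debian/Ubuntu); pass another path as $1.
# Reading the log usually requires root privileges.
count_missing_docker_logins() {
  log_file="${1:-/var/log/auth.log}"
  # grep -c prints the number of matching lines (0 if there are none).
  grep -c 'Invalid user docker' "$log_file"
}

# Example call, run from a root shell on the node:
#   count_missing_docker_logins /var/log/auth.log
```

A non-zero count is consistent with the situation described here, although it does not by itself prove that a pre-existing cloud-init configuration is the cause.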
Resolution
Convert the Template back to a VM and run:
sudo cloud-init clean
This command removes any existing cloud-init state from the VM. Once it completes, convert the VM back to a Template and try provisioning again.
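Before converting the VM back to a Template, it may be worth checking that the clean actually removed the per-instance state. The sketch below assumes cloud-init's default state directory, /var/lib/cloud, and the helper name is hypothetical. It only looks for the instance link and the instances directory, so treat it as a quick sanity check rather than a full verification.

```shell
# Hypothetical helper: succeed if cloud-init per-instance state is still present.
# /var/lib/cloud is assumed as the state directory; pass another path as $1.
has_cloud_init_state() {
  state_dir="${1:-/var/lib/cloud}"
  [ -e "$state_dir/instance" ] ||
    [ -L "$state_dir/instance" ] ||
    [ -d "$state_dir/instances" ]
}

# Example call on the VM after running the clean:
#   if has_cloud_init_state; then
#     echo "cloud-init state still present; run: sudo cloud-init clean"
#   else
#     echo "no cloud-init instance state found"
#   fi
```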
Cause
When provisioning a vSphere Node Driver cluster, Rancher v2.x uses cloud-init to generate an SSH key pair for the user docker and copy it into the Virtual Machine on initial boot.
In some Linux distributions, including Ubuntu Server 18.04, the standard OS installation process generates a cloud-init configuration. The OS is installed during the initial setup of the VM Template, before the cluster is provisioned via Rancher, and this existing cloud-init configuration within the Template can interfere with Rancher's ability to insert its own cloud-init configuration.
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.