Unintended Terraform configuration drift caused by machine pool ordering
Article Number: 000022130
Environment
Terraform-provisioned RKE2 downstream cluster.
Situation
When adding a new machine pool to an existing Rancher2 RKE2 cluster, Terraform may plan to modify existing machine pools that were provisioned in previous runs, causing unintended updates to control plane and worker nodes.
Cause
Terraform maps are unordered collections. When the Rancher provider converts a map into the provider's TypeList, keys are sorted lexicographically. Machine pools are then matched by position (index) inside that list, not by name. Adding a pool whose key sorts before an existing key shifts the index of every pool after it. Terraform then associates the existing list positions with different pool configurations and plans in-place updates to the wrong pools.
Impact
- Unexpected in-place modifications of existing pools.
- Potential disruption of node roles (control/etcd/worker) and pod scheduling.
Illustrative Example
Assume an existing cluster has two machine pools: control and worker.
Initial Terraform run
Map input (unordered):
control -> control_plane: true, quantity: 3
worker -> worker_role: true, quantity: 2
Lexicographical sort creates ordered list:
[0] control -> creates control pool
[1] worker -> creates worker pool
Adding new pool NEW-POOL
Map input (unordered):
control -> control_plane: true, quantity: 3
worker -> worker_role: true, quantity: 2
NEW-POOL -> worker_role: true, quantity: 1
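Lexicographical sort creates a new ordered list (uppercase letters sort before lowercase, so NEW-POOL comes first):
[0] NEW-POOL -> (index 0 already exists and holds control)
[1] control -> (index 1 already exists and holds worker)
[2] worker -> (index 2 is new)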
Result (incorrect):
Terraform sees index mismatches and plans to:
- Modify the existing pool at index 0 (previously control) to the NEW-POOL configuration.
- Modify the existing pool at index 1 (previously worker) to the control configuration.
- Create a new worker pool at index 2.
Expected: Create one new NEW-POOL pool without modifying control or worker.
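The index shift described above can be reproduced outside Terraform. The following is a minimal Python sketch, not the provider's actual code: it assumes that Python's default string sort behaves like the provider's lexicographic sort for these ASCII keys, and the helper names to_ordered_list and plan_by_index are hypothetical.

```python
def to_ordered_list(pools):
    """Turn a map of pools into a list ordered by lexicographically sorted key."""
    return [dict(pools[key], name=key) for key in sorted(pools)]


def plan_by_index(old, new):
    """Compare two pool lists position by position, ignoring pool names.

    Only additions are modeled here; removals are out of scope for this sketch.
    """
    actions = []
    for i, pool in enumerate(new):
        if i >= len(old):
            actions.append(f"create [{i}] {pool['name']}")
        elif old[i] != pool:
            actions.append(f"update [{i}] {old[i]['name']} -> {pool['name']}")
    return actions


before = {
    "control": {"control_plane_role": True, "quantity": 3},
    "worker": {"worker_role": True, "quantity": 2},
}
after = dict(before)
after["NEW-POOL"] = {"worker_role": True, "quantity": 1}

actions = plan_by_index(to_ordered_list(before), to_ordered_list(after))
for action in actions:
    print(action)
```

Under these assumptions the sketch reports two updates and one create, mirroring the incorrect result listed above: both existing positions are rewritten and the last position is created as worker.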
Example Terraform plan snippet (illustrative)
# rancher2_cluster_v2.cluster_rke2 will be updated in-place
~ resource "rancher2_cluster_v2" "cluster_rke2" {
    ~ rke_config {
        # Index [0]: Existing "control" pool → incorrectly changed to "NEW-POOL"
        ~ machine_pools {
            ~ name = "control" -> "NEW-POOL"
            ~ control_plane_role = true -> false
            ~ etcd_role = true -> false
            ~ worker_role = false -> true
            ~ quantity = 3 -> 1
            ~ machine_labels = {
                ~ "nodepool" = "control" -> "worker"
            }
        }
        # Index [1]: Existing "worker" pool → incorrectly changed to "control"
        ~ machine_pools {
            ~ name = "worker" -> "control"
            ~ control_plane_role = false -> true
            ~ etcd_role = false -> true
            ~ worker_role = true -> false
            ~ quantity = 2 -> 3
            ~ machine_labels = {
                ~ "nodepool" = "worker" -> "control"
            }
        }
        # Index [2]: New pool created as "worker" (expected "NEW-POOL")
        + machine_pools {
            + name = "worker"
            + control_plane_role = false
            + worker_role = true
            + quantity = 2
        }
    }
}
Plan: 0 to add, 1 to change, 0 to destroy.
Resolution
Stable ordering of keys is the simplest and most reliable workaround until a provider-level fix is available.
- Use ordered keys (prefix keys with numbers) in the map so that lexicographical sorting produces a stable list order. Example:
  1-control:
    name: control
    control_plane_role: true
    quantity: 3
  2-worker:
    name: worker
    worker_role: true
    quantity: 2
  3-NEW-POOL:
    name: NEW-POOL
    worker_role: true
    quantity: 1
- When adding pools, give the new entry a key that sorts after every existing key (for example, the next numeric prefix, as above) so that existing pools keep their list positions. Test in a non-production environment before applying to production clusters.
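Before running a plan, a candidate key can be checked to confirm that it leaves existing positions untouched. The sketch below is illustrative only: shifted_pools is a hypothetical helper, the 10-gpu key is an invented example, and Python's string sort is assumed to stand in for the provider's lexicographic sort. It also shows that unpadded prefixes stop working at ten pools, because "10-" sorts before "2-", so zero-padded prefixes such as 01-, 02- are the safer choice for larger maps.

```python
def shifted_pools(existing_keys, new_key):
    """Return the existing keys whose list index would change if new_key were added."""
    old_order = sorted(existing_keys)
    new_order = sorted(list(existing_keys) + [new_key])
    return [key for i, key in enumerate(old_order) if new_order.index(key) != i]


# Numeric prefixes: the new key sorts last, so nothing moves.
safe = shifted_pools(["1-control", "2-worker"], "3-NEW-POOL")

# Unprefixed keys: NEW-POOL sorts first and pushes both existing pools down.
unsafe = shifted_pools(["control", "worker"], "NEW-POOL")

# Unpadded two-digit prefix: "10-gpu" lands between "1-control" and "2-worker".
unpadded = shifted_pools(["1-control", "2-worker"], "10-gpu")

# Zero-padded prefixes keep the new key at the end.
padded = shifted_pools(["01-control", "02-worker"], "10-gpu")

print(safe, unsafe, unpadded, padded)
```

An empty result means the new key is safe to add; any key in the result is a pool that Terraform would plan to modify in place.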