Longhorn Node Maintenance Guideline
Article Number: 000021085
Environment
- SUSE Storage 1.4+
Situation
This article describes how to handle planned node maintenance.
Resolution
Follow these steps to shut down all volumes without having to modify the scale of the end user/application deployments in the cluster.
- Cordon the node. Longhorn will automatically disable node scheduling when a Kubernetes node is cordoned.
- Drain the node to move the workload to another node. Use the drain command from the documentation that matches your Longhorn version:
  - For Longhorn before 1.4.x (doc)
  - For Longhorn 1.4.x+ (doc) - the drain command has been simplified
  This will evict the workloads from the draining node.
- The replica processes on the node will be stopped at this stage. Replicas on the node will be shown as Failed.
Note: By default, if the node holds the last healthy replica of a volume, Longhorn will prevent the node from completing the drain operation, to protect that replica and avoid disrupting the workload. You can either override this behavior in the Longhorn settings, or evict the replica to another node before draining.
- The engine processes on the node will be migrated with the Pod to other nodes.
Note: If there are Longhorn volumes on the node that were manually attached through the Longhorn UI, Longhorn will prevent the node from completing the drain operation. Detach these volumes using the Longhorn UI before draining.
- After the drain is completed, there should be no engine or replica process running on the node. Two instance managers will still be running on the node, but they’re stateless and won’t cause interruption to the existing workload.
- Perform the necessary maintenance, including shutting down or rebooting the node.
- Uncordon the node. Longhorn will automatically re-enable the node scheduling.
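The cordon, drain, and uncordon steps above can be sketched as a small shell helper. This is only an illustrative sketch, not the official procedure: the function names, the DRY_RUN switch, and the node name worker-1 are hypothetical, and the drain flags shown are an assumption that may not match your Longhorn version. Take the exact drain command from the version-specific documentation.

```shell
#!/bin/sh
# Sketch of the cordon / drain / uncordon sequence for one node.
# Assumption: kubectl is installed and configured for the target cluster.
# DRY_RUN is a hypothetical safety switch: when it is 1 (the default here),
# commands are printed instead of executed.

run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"   # print the command only
  else
    "$@"                 # execute the command
  fi
}

# Step 1: cordon the node. Longhorn then disables scheduling on it.
cordon_node() {
  run kubectl cordon "$1"
}

# Step 2: drain the node so workloads move elsewhere.
# Assumption: --ignore-daemonsets alone is sufficient. Your version's
# documentation may call for additional flags.
drain_node() {
  run kubectl drain "$1" --ignore-daemonsets
}

# Step 4: uncordon after maintenance. Longhorn re-enables scheduling.
uncordon_node() {
  run kubectl uncordon "$1"
}
```

With the default dry-run behavior, `drain_node worker-1` prints `kubectl drain worker-1 --ignore-daemonsets` without touching the cluster. Setting `DRY_RUN=0` makes each helper execute the command. Defaulting to dry-run is a deliberate choice so the commands can be reviewed before a real maintenance window.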
Important note 1: Always refer to the respective product documentation for clear steps and details. The KB content is only a pointer to the main aspects of the current issue being addressed.
Important note 2: Always refer to the documentation of the specific product version. Many features and functions have changed between versions, so make sure you are reading the documentation for the version you are running.