OOM (Out Of Memory) kills and high memory consumption: basic troubleshooting steps

Article Number: 000021755

Environment

Rancher 2.x

Situation

Memory consumption on the nodes is too high, or OOM kills are happening frequently.

At the Kubernetes level
Start with kubectl top, which shows what is consuming memory at that point in time:

# check which pods are consuming the most memory, across all namespaces
kubectl top pods -A --sort-by=memory
# check which nodes are affected
kubectl top nodes

A few questions that can help are:

  • Which pods are consuming the most resources?
  • Is it on a specific node, or across all nodes?
  • When you describe the node, is it over-provisioned? (see the node example below)

Answering these questions can reveal opportunities for better capacity planning for your applications.
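
As a quick check (the node name is a placeholder), describe the node to see its capacity, allocatable memory, and the Allocated resources summary, which shows whether the requests and limits of scheduled pods exceed what the node can actually provide:

# check capacity, allocatable memory, and the Allocated resources summary
kubectl describe node <node-name>

# list the pods scheduled on that node
kubectl get pods -A --field-selector spec.nodeName=<node-name> -o wide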

At the node level
Check the system log (for example, /var/log/messages, or the output of dmesg -T) for OOM kill messages:

  • If the kill was invoked by a cgroup, container limits are being enforced as configured. Adjust the limits as needed.
  • If it was invoked by the kernel, the node itself is running out of memory and the kernel OOM killer is reclaiming it.

Also check the kubelet logs for OOM kills (see the example below).
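
A minimal sketch of these checks (the journalctl unit name assumes a systemd-managed kubelet, which varies by Kubernetes distribution):

# kernel log entries from the OOM killer
dmesg -T | grep -i -E "killed process|out of memory|oom"

# the same information from the journal on systemd-based systems
journalctl -k | grep -i oom

# kubelet logs, assuming the kubelet runs as a systemd unit
journalctl -u kubelet | grep -i oom

On RKE clusters the kubelet runs as a container rather than a systemd unit, so check its container logs instead (for example, docker logs kubelet).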

Cause

OOM kills or high memory usage might be caused by a lack of resources, configuration issues, or application failures.

Resolution

Rancher Project Resource Quotas:
Rancher allows for resource management at the Project level. Please review the documentation on how to set limits at the Project and Namespace levels. 
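
Under the hood, Rancher enforces these quotas as standard Kubernetes ResourceQuota objects in each namespace. As a minimal sketch (the quota and namespace names are placeholders), an equivalent namespace-level quota can be created directly with kubectl:

# cap the aggregate memory requests and limits for a namespace
kubectl create quota memory-quota --hard=requests.memory=1Gi,limits.memory=2Gi -n <namespace>

# review the quota and how much of it is currently consumed
kubectl describe resourcequota memory-quota -n <namespace>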

For non-Rancher components:
Adjust the requests and limits as described in the Kubernetes documentation. This can be done at several levels: directly in the pod spec under spec.containers[].resources, or through a Helm chart's values.yaml. Here is an example from Rancher Monitoring:

resources:
  limits:
    memory: 500Mi
    cpu: 1000m
  requests:
    memory: 100Mi
    cpu: 100m
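
To confirm what a running workload actually has configured (the pod and namespace names are placeholders):

# the Limits and Requests sections show the effective values per container
kubectl describe pod <pod-name> -n <namespace>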

If you are experiencing issues with Rancher-shipped components, open a case with SUSE Rancher Support and collect all of the data below when doing so.