
How to test Alertmanager in rancher-monitoring

Article Number: 000022097

Environment

A Kubernetes cluster managed by Rancher v2.6+ with rancher-monitoring installed

Procedure

This guide demonstrates how to test Alertmanager and PrometheusRule configuration, to validate that alerts are sent successfully by Alertmanager.

With this objective in mind, and so that the test is self-contained, a webhook receiver is configured in Alertmanager. A webhook-receiver pod is deployed to receive these webhook alert requests and print them to stdout, so that they are visible in the Pod logs for verification. All of these resources are created in the cattle-monitoring-system namespace.

  1. Navigate to a Rancher-managed cluster with rancher-monitoring installed.
  2. Apply the following YAMLs:

    1. ConfigMap:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: webhook-receiver-configmap-script
      namespace: cattle-monitoring-system
    data:
      receiver.sh: |
        #!/bin/bash
        TIMEOUT_SEC=1
    
        echo "Starting nc listener with $TIMEOUT_SEC second timeout..."
    
        while true; do
          REQUEST_FILE="/tmp/request.log"
    
          echo "Waiting for connection..."
    
          # Sends a minimal HTTP 200 response to the client, while nc writes the
          # received request (headers and body) to the request file.
          (
            printf "HTTP/1.1 200 OK\r\nConnection: close\r\nContent-Type: text/plain\r\n\r\nRequest successfully logged.\n"
          ) | nc -l -p 8080 -w $TIMEOUT_SEC > $REQUEST_FILE
    
    
          # Prints the captured request to stdout
          echo -e "\n--- RECEIVED FULL REQUEST ---\n"
          cat $REQUEST_FILE
          echo -e "\n--- END OF REQUEST ---\n"
    
          # Removes the temp file before the next iteration
          rm -f $REQUEST_FILE
          sleep 0.1
        done
    
    2. Pod:

    apiVersion: v1
    kind: Pod
    metadata:
      name: webhook-receiver
      namespace: cattle-monitoring-system
      labels:
        app: webhook-receiver
    spec:
      containers:
      - name: receiver-container
        image: rancherlabs/swiss-army-knife:latest
        command: ["/bin/bash", "/script/receiver.sh"]
        ports:
        - containerPort: 8080
        volumeMounts:
        - name: receiver-script-volume
          mountPath: /script
      volumes:
      - name: receiver-script-volume
        configMap:
          name: webhook-receiver-configmap-script
          defaultMode: 0744
    
    3. Service:

    apiVersion: v1
    kind: Service
    metadata:
      name: webhook-receiver-service
      namespace: cattle-monitoring-system
    spec:
      selector:
        app: webhook-receiver
      ports:
        - protocol: TCP
          port: 80
          targetPort: 8080
      type: ClusterIP
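
For example, assuming the three manifests above have been saved to local files (the file names below are illustrative), they can be applied and the receiver pod checked with kubectl commands along these lines:

    # Apply the ConfigMap, Pod and Service (file names are examples)
    kubectl apply -f webhook-receiver-configmap.yaml
    kubectl apply -f webhook-receiver-pod.yaml
    kubectl apply -f webhook-receiver-service.yaml

    # Confirm the pod is Running, then follow its log
    kubectl -n cattle-monitoring-system get pod webhook-receiver
    kubectl -n cattle-monitoring-system logs -f webhook-receiver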
    
  3. Ensure that the pod is up and tail its log (as sketched above); you should see a couple of lines stating that the netcat listener has started and is waiting for a connection. The Alertmanager alert configured below will be visible in these logs.
  4. Apply the following AlertmanagerConfig to configure Alertmanager to send any alerts with the label "severity=critical" to the webhook-receiver pod (refer to the Alertmanager configuration documentation for details). Note that the URL used is that of the Service created above; a sketch of how to verify the applied configuration follows the YAML:

apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: webhook-receiver-am-config
  namespace: cattle-monitoring-system
spec:
  receivers:
    - name: webhook-receiver-pod
      webhookConfigs:
        - url: http://webhook-receiver-service/
          sendResolved: true
  route:
    receiver: webhook-receiver-pod
    routes:
      - matchers:
          - name: severity
            value: critical
        receiver: webhook-receiver-pod
        continue: false
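
To confirm that the AlertmanagerConfig was created and merged into the running configuration, a check along the following lines can be used. The pod and container names assume the default rancher-monitoring Alertmanager StatefulSet, and amtool is assumed to be present in the Alertmanager image:

    # Confirm the AlertmanagerConfig resource exists
    kubectl -n cattle-monitoring-system get alertmanagerconfig webhook-receiver-am-config

    # Print the routing tree of the running Alertmanager; a route for
    # cattle-monitoring-system/webhook-receiver-am-config should be listed
    # (pod and container names are assumptions based on the default install)
    kubectl -n cattle-monitoring-system exec alertmanager-rancher-monitoring-alertmanager-0 \
      -c alertmanager -- amtool config routes --alertmanager.url=http://localhost:9093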
  5. Create a PrometheusRule with an alert expression. This example uses vector(1) as the expression, so that its value will always be "1" and the alert will be triggered continuously (an optional Prometheus-side check is sketched after the rule):

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: test-rule
  namespace: cattle-monitoring-system
spec:
  groups:
    - name: test-rule
      rules:
        - alert: test-alert
          expr: vector(1)
          for: 0s
          labels:
            namespace: cattle-monitoring-system
            severity: critical
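
Before waiting on Alertmanager, the rule can optionally be verified on the Prometheus side. This sketch assumes the default rancher-monitoring-prometheus Service on port 9090, the same service and port that appear in the generatorURL of the example payload further below:

    # Confirm the PrometheusRule resource exists
    kubectl -n cattle-monitoring-system get prometheusrule test-rule

    # Port-forward to the Prometheus service in one terminal...
    kubectl -n cattle-monitoring-system port-forward svc/rancher-monitoring-prometheus 9090:9090

    # ...and in another terminal query the built-in ALERTS metric; a series with
    # alertstate="firing" confirms the alert is active on the Prometheus side
    curl -G 'http://localhost:9090/api/v1/query' --data-urlencode 'query=ALERTS{alertname="test-alert"}'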
  6. Wait for the alert to appear in the Alertmanager Alerts UI.
  7. Check the log of the webhook-receiver pod and observe that the test-alert alert is received, similar to the following:

Starting nc listener with 1 second timeout...
Waiting for connection...

--- RECEIVED FULL REQUEST ---

POST / HTTP/1.1
Host: webhook-receiver-service
User-Agent: Alertmanager/0.28.1
Content-Length: 1214
Content-Type: application/json

{"receiver":"cattle-monitoring-system/webhook-receiver-am-config/webhook-receiver-pod","status":"firing","alerts":[{"status":"firing","labels":{"alertname":"test-alert","namespace":"cattle-monitoring-system","prometheus":"cattle-monitoring-system/rancher-monitoring-prometheus","severity":"critical"},"annotations":{},"startsAt":"2025-10-14T09:04:11.437Z","endsAt":"0001-01-01T00:00:00Z","generatorURL":"https://142.93.230.60.nip.io/k8s/clusters/c-m-d2xdbdjr/api/v1/namespaces/cattle-monitoring-system/services/http:rancher-monitoring-prometheus:9090/proxy/graph?g0.expr=vector%281%29\u0026g0.tab=1","fingerprint":"163a7e819a18ef74"}],"groupLabels":{"namespace":"cattle-monitoring-system"},"commonLabels":{"alertname":"test-alert","namespace":"cattle-monitoring-system","prometheus":"cattle-monitoring-system/rancher-monitoring-prometheus","severity":"critical"},"commonAnnotations":{},"externalURL":"https://142.93.230.60.nip.io/k8s/clusters/c-m-d2xdbdjr/api/v1/namespaces/cattle-monitoring-system/services/http:rancher-monitoring-alertmanager:9093/proxy","version":"4","groupKey":"{}/{namespace=\"cattle-monitoring-system\"}/{severity=\"critical\"}:{namespace=\"cattle-monitoring-system\"}","truncatedAlerts":0}

--- END OF REQUEST ---
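
Once testing is complete, the resources created for this test can be removed:

    kubectl -n cattle-monitoring-system delete prometheusrule test-rule
    kubectl -n cattle-monitoring-system delete alertmanagerconfig webhook-receiver-am-config
    kubectl -n cattle-monitoring-system delete service webhook-receiver-service
    kubectl -n cattle-monitoring-system delete pod webhook-receiver
    kubectl -n cattle-monitoring-system delete configmap webhook-receiver-configmap-script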

Following this method, it is possible to test Alertmanager and PrometheusRule configurations without needing a third-party application or configuring an external receiver, and to confirm whether alerts arrive as expected or are not being sent at all. If you are struggling to get an AlertmanagerConfig applied correctly, check the rancher-monitoring-operator pod logs to confirm that the syntax was accepted, check the Alertmanager pod logs for delivery errors, and evaluate the PrometheusRule expression in the Prometheus Query UI to confirm whether the alert should currently be firing.
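
As a sketch of those log checks (workload names assume the default rancher-monitoring installation and may differ if the chart was customized):

    # Operator logs: confirm the AlertmanagerConfig syntax was accepted
    kubectl -n cattle-monitoring-system logs deploy/rancher-monitoring-operator --tail=100

    # Alertmanager logs: look for notification or delivery errors
    kubectl -n cattle-monitoring-system logs alertmanager-rancher-monitoring-alertmanager-0 -c alertmanager --tail=100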