How to Configure Alerts in Prometheus and Visualize Them in Grafana: A Step-by-Step Guide
Prometheus not only monitors your application’s performance but can also be configured to send alerts when certain conditions are met, such as high CPU usage or memory exhaustion. Alerts are crucial for real-time monitoring and system reliability. In this guide, we’ll configure Prometheus to trigger alerts based on a metric threshold (e.g., CPU usage greater than 80%) and display the alerts in Grafana.
Prerequisites
- Prometheus and Grafana are installed and running (as described in previous blogs).
- A basic understanding of Prometheus metrics and alerts.
- Alertmanager to handle alert notifications (Step 1 covers installation).
- Docker knowledge (optional but useful for managing Alertmanager).
Step 1: Install Alertmanager
Prometheus evaluates alerting rules and sends the resulting alerts to Alertmanager. Alertmanager then routes them, based on its configuration, to receivers such as Slack, email, or webhooks.
Option 1: Install Alertmanager Using Docker (Recommended)
1. To pull and run Alertmanager using Docker, execute the following command:
docker run -d --name=alertmanager -p 9093:9093 prom/alertmanager
2. Once Alertmanager is running, it will be available at http://localhost:9093.
Option 2: Install Alertmanager Manually
1. Download Alertmanager from the official Prometheus downloads page.
2. Extract the tarball:
tar -xvzf alertmanager-*.tar.gz
3. Start Alertmanager:
./alertmanager --config.file=alertmanager.yml
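Whichever option you choose, it helps to confirm that Alertmanager is actually reachable before wiring Prometheus to it. The sketch below polls Alertmanager's /-/healthy endpoint using only the Python standard library. The function name and the default URL are illustrative assumptions that match the port used above.

```python
import urllib.request


def alertmanager_is_healthy(base_url="http://localhost:9093", timeout=3.0):
    """Return True if Alertmanager answers /-/healthy with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/-/healthy", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection refused, timeouts, and HTTP error statuses
        # (urllib's URLError and HTTPError both derive from OSError).
        return False
```

Calling alertmanager_is_healthy() should return True once the container or binary from this step is up. A False result usually means the process is not running or the port mapping is wrong.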
Step 2: Configure Alerting Rules in Prometheus
Now we need to configure Prometheus to trigger alerts based on certain conditions, such as CPU usage exceeding 80%.
1. Create a file called alert.rules.yml in the Prometheus directory with the following content:
groups:
  - name: example
    rules:
      - alert: HighCPUUsage
        expr: 100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[1m])) * 100) > 80
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "High CPU usage detected on {{ $labels.instance }}"
          description: "CPU usage has been over 80% for the past 1 minute on {{ $labels.instance }}."
2. This rule fires an alert when the average CPU usage across an instance's cores stays above 80% for at least 1 minute.
3. Update your prometheus.yml configuration file to include the alert rules:
rule_files:
  - "alert.rules.yml"
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - "localhost:9093" # Address of the Alertmanager instance
4. Restart Prometheus to apply the changes:
./prometheus --config.file=prometheus.yml
If using Docker, restart the Prometheus container:
docker restart prometheus
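To make the arithmetic in the HighCPUUsage expression concrete, here is a small sketch that mirrors it in plain Python. It assumes you already have the per-core idle rates, that is, the fraction of each second a core spent idle, which is what rate(node_cpu_seconds_total{mode="idle"}[1m]) returns. The function names are illustrative and not part of Prometheus.

```python
def cpu_usage_percent(idle_rates):
    """Mirror 100 - (avg(idle rate) * 100) for a single instance.

    idle_rates: one value per CPU core, each between 0.0 (fully busy)
    and 1.0 (fully idle).
    """
    avg_idle = sum(idle_rates) / len(idle_rates)
    return 100 - avg_idle * 100


def breaches_threshold(idle_rates, threshold=80.0):
    """True when the computed usage is above the rule's threshold."""
    return cpu_usage_percent(idle_rates) > threshold


# Four cores: two fully busy, two idle 25% of the time.
print(cpu_usage_percent([0.25, 0.0, 0.25, 0.0]))  # 87.5
```

Averaging by instance means one saturated core does not trigger the alert on its own. The average across all of the instance's cores has to exceed the threshold.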
Step 3: Set Up Alertmanager
Alertmanager needs to be configured to handle the alerts triggered by Prometheus.
1. Create or modify alertmanager.yml with a basic notification configuration. The example below routes all alerts to an email receiver. Adjust it to your notification preferences, for example by adding a Slack receiver:
route:
  receiver: 'email'
receivers:
  - name: 'email'
    email_configs:
      - to: 'your-email@example.com'
        from: 'alertmanager@example.com'
        smarthost: 'smtp.example.com:587'
        auth_username: 'username'
        auth_password: 'password'
2. Start or restart Alertmanager with the configuration:
./alertmanager --config.file=alertmanager.yml
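Or, if you’re using Docker, restart the container. Note that the container only picks up your alertmanager.yml if the file is mounted into it (the prom/alertmanager image reads /etc/alertmanager/alertmanager.yml by default):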
docker restart alertmanager
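Before waiting for a real CPU spike, you may want to check that the email route works by pushing a synthetic alert straight into Alertmanager. The sketch below builds a payload for Alertmanager's POST /api/v2/alerts endpoint and sends it with the standard library. The instance label, the function names, and the five-minute lifetime are hypothetical values chosen for illustration.

```python
import json
import urllib.request
from datetime import datetime, timedelta, timezone


def build_test_alert(instance="test-host:9100", minutes=5):
    """Build a one-alert payload for POST /api/v2/alerts."""
    now = datetime.now(timezone.utc)
    return [{
        "labels": {
            "alertname": "HighCPUUsage",
            "instance": instance,  # hypothetical instance label
            "severity": "critical",
        },
        "annotations": {"summary": f"Test alert for {instance}"},
        # RFC 3339 timestamps; the alert resolves itself after endsAt.
        "startsAt": now.isoformat(timespec="seconds"),
        "endsAt": (now + timedelta(minutes=minutes)).isoformat(timespec="seconds"),
    }]


def send_test_alert(base_url="http://localhost:9093"):
    """POST the payload to Alertmanager and return the HTTP status code."""
    body = json.dumps(build_test_alert()).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/api/v2/alerts",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.status
```

After calling send_test_alert(), the alert should show up in the Alertmanager UI at http://localhost:9093 and, if the SMTP settings are valid, arrive at the configured address.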
Step 4: Verify Alerts in Prometheus
- Open Prometheus UI at http://localhost:9090.
- Navigate to the Alerts section to see a list of active and inactive alerts.
- To simulate high CPU usage, you can either:
  - Stress the CPU of your system using a tool like stress.
  - Adjust the PromQL expression temporarily to trigger an alert.
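As soon as CPU usage exceeds the threshold, the alert appears as Pending. If the condition holds for the full for duration (1 minute in this rule), the alert moves from Pending to Firing and is sent to Alertmanager.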
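The role of the for: 1m clause in the rule from Step 2 is easier to see as a small simulation. This is a simplified sketch of the state transitions, not Prometheus's actual implementation. It assumes a 15-second evaluation interval, which depends on the evaluation_interval setting in your prometheus.yml.

```python
def alert_states(condition_by_eval, eval_interval_s=15, for_s=60):
    """Return the alert state after each rule evaluation.

    condition_by_eval: list of booleans, True when the rule expression
    matched (CPU above the threshold) at that evaluation.
    """
    states = []
    active_for = None  # seconds since the condition first became true
    for is_true in condition_by_eval:
        if not is_true:
            # A single non-matching evaluation resets the alert.
            active_for = None
            states.append("inactive")
            continue
        active_for = 0 if active_for is None else active_for + eval_interval_s
        states.append("firing" if active_for >= for_s else "pending")
    return states


print(alert_states([True, True, True, True, True]))
# ['pending', 'pending', 'pending', 'pending', 'firing']
```

A short spike that clears before the minute is up never leaves Pending, which is what keeps brief bursts of load from sending notifications.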
Step 5: Display Alerts in Grafana
Grafana can visualize alerts from Prometheus directly, providing a single dashboard for monitoring and alerts.
- Go to Configuration (gear icon) > Data Sources in Grafana. In newer Grafana versions this is under Connections > Data sources.
- Select the Prometheus data source.
- Scroll down to the Alerting section and enable it if it’s not already enabled.
Create a Panel for Alerts
1. In the Dashboard, create a new panel.
2. In the query section, use PromQL to display alert information:
ALERTS
3. Visualize it as a Table or Graph based on your preference.
4. Click Apply to add the panel to your dashboard.
Now you’ll be able to see active alerts directly in Grafana.
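The ALERTS series used in the panel carries an alertstate label (pending or firing) alongside the labels from your rule, so you can narrow the query to ALERTS{alertstate="firing"}. As a rough sketch, the same query can be run against the Prometheus HTTP API to check what Grafana will receive. The function names and the default URL are assumptions for illustration.

```python
import json
import urllib.parse
import urllib.request


def alerts_query_url(base_url="http://localhost:9090", state="firing"):
    """Build an instant-query URL for ALERTS filtered by alert state."""
    query = f'ALERTS{{alertstate="{state}"}}'
    return f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": query})


def summarize(response):
    """Turn a query response into (alertname, instance, severity) rows."""
    rows = []
    for series in response["data"]["result"]:
        metric = series["metric"]
        rows.append((metric.get("alertname"), metric.get("instance"), metric.get("severity")))
    return rows


def fetch_firing_alerts(base_url="http://localhost:9090"):
    """Query a running Prometheus server and summarize firing alerts."""
    with urllib.request.urlopen(alerts_query_url(base_url), timeout=5) as resp:
        return summarize(json.load(resp))
```

If fetch_firing_alerts() returns an empty list while the Prometheus Alerts page shows a firing alert, check that Grafana and this script point at the same Prometheus instance.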
Conclusion
In this blog, we configured Prometheus to trigger alerts based on specific metric thresholds, such as high CPU usage, and routed those alerts through Alertmanager. Finally, we displayed the alerts in Grafana to create a combined monitoring and alerting dashboard.