SSL Certificate Monitoring with Prometheus and Grafana
Ask any engineer who’s dealt with an expired certificate in production and they’ll tell you the same thing: the alert that matters is the one that fires before everything breaks, not the PagerDuty that wakes you up at 3 AM. SSL certificate expiration is completely predictable — you know the exact moment it expires from day one — yet it remains one of the most common causes of avoidable outages.
If you’re already running Prometheus and Grafana for infrastructure observability, adding certificate monitoring is a natural extension. This guide walks through the full setup: exporters, alert rules, Grafana dashboards, and Alertmanager routing, so certificates become just another metric in your observability stack.
For the Kubernetes side of certificate management, cert-manager automates the entire issuance and renewal workflow and plays nicely with Prometheus metrics — which we’ll cover later in this guide.
Monitoring Architecture
[Endpoints] → [Exporters] → [Prometheus] → [Grafana]
                                 ↓
                          [Alertmanager]
SSL/TLS Exporters for Prometheus
1. ssl_exporter
The most widely used dedicated SSL/TLS exporter; it probes endpoints over TLS and exposes certificate details as Prometheus metrics.
Installation:
# Binary release
wget https://github.com/ribbybibby/ssl_exporter/releases/download/v2.4.2/ssl_exporter-2.4.2.linux-amd64.tar.gz
tar xvf ssl_exporter-2.4.2.linux-amd64.tar.gz
# The archive unpacks into a versioned directory
sudo mv ssl_exporter-2.4.2.linux-amd64/ssl_exporter /usr/local/bin/

# Systemd service. tee is needed here: with `sudo cat > file` the redirection
# runs in the unprivileged shell and fails. Quoting 'EOF' writes the unit literally.
sudo tee /etc/systemd/system/ssl_exporter.service > /dev/null <<'EOF'
[Unit]
Description=SSL Exporter
After=network.target

[Service]
Type=simple
User=prometheus
ExecStart=/usr/local/bin/ssl_exporter \
  --web.listen-address=:9219 \
  --web.metrics-path=/metrics

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable --now ssl_exporter
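Once the unit is running, a quick check confirms the exporter is up before wiring it into Prometheus (the /probe endpoint itself is exercised in the Troubleshooting section below):
# Service status
systemctl status ssl_exporter
# The exporter's own metrics endpoint should respond
curl -s http://localhost:9219/metrics | head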
Prometheus Configuration:
# prometheus.yml
scrape_configs:
  - job_name: 'ssl'
    metrics_path: /probe
    static_configs:
      - targets:
          - example.com:443
          - api.example.com:443
          - www.example.com:443
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: localhost:9219
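Before reloading Prometheus, it's cheap to validate the file first; promtool ships with Prometheus, and the reload endpoint only works if Prometheus was started with --web.enable-lifecycle:
# Validate configuration and any referenced rule files
promtool check config /etc/prometheus/prometheus.yml
# Hot-reload without a restart (requires --web.enable-lifecycle)
curl -X POST http://localhost:9090/-/reload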
2. blackbox_exporter
A general-purpose probing exporter (HTTP, TCP, ICMP, DNS) that can also check TLS certificates.
Configuration:
# blackbox.yml
modules:
  ssl_expiry:
    prober: tcp
    timeout: 5s
    tcp:
      tls: true
      tls_config:
        insecure_skip_verify: false
Prometheus config:
scrape_configs:
  - job_name: 'blackbox'
    metrics_path: /probe
    params:
      module: [ssl_expiry]
    static_configs:
      - targets:
          # the tcp prober dials host:port, so drop the https:// scheme
          - example.com:443
          - api.example.com:443
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: localhost:9115
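Note that blackbox_exporter uses its own metric names: expiry comes out as probe_ssl_earliest_cert_expiry rather than ssl_cert_not_after, so the PromQL later in this guide needs adjusting if you go this route:
# Days to expiration, blackbox_exporter flavor
(probe_ssl_earliest_cert_expiry - time()) / 86400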
Key Metrics
ssl_exporter Metrics
# Expiration date (Unix timestamp, seconds)
ssl_cert_not_after
# Issue date
ssl_cert_not_before
# Certificate details (subject CN, issuer CN, serial) arrive as labels
# on the cert metrics, e.g. ssl_cert_not_after{cn="...", issuer_cn="..."}
# Verification status
ssl_tls_connect_success
ssl_probe_success
# TLS version (exposed via the "version" label)
ssl_tls_version_info
Calculating Days Until Expiration
# Days to expiration
(ssl_cert_not_after - time()) / 86400
# Hours to expiration
(ssl_cert_not_after - time()) / 3600
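These expressions are easy to sanity-check: openssl can read the same expiry date straight off the wire, which is handy when you want to confirm what the exporter is reporting:
# Read the certificate's notAfter date directly
echo | openssl s_client -connect example.com:443 -servername example.com 2>/dev/null \
  | openssl x509 -noout -enddate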
Prometheus Alerts
Alert Rules
# prometheus-rules.yml
groups:
  - name: ssl_certificates
    rules:
      # Alert 30 days before expiration
      - alert: SSLCertExpiringSoon
        expr: (ssl_cert_not_after - time()) / 86400 < 30
        for: 24h
        labels:
          severity: warning
        annotations:
          summary: "SSL certificate expiring soon for {{ $labels.instance }}"
          description: "Certificate expires in {{ $value | printf \"%.0f\" }} days"

      # Alert 7 days before expiration
      - alert: SSLCertExpiringCritical
        expr: (ssl_cert_not_after - time()) / 86400 < 7
        for: 1h
        labels:
          severity: critical
        annotations:
          summary: "SSL certificate expiring CRITICAL for {{ $labels.instance }}"
          description: "Certificate expires in {{ $value | printf \"%.0f\" }} days!"

      # Certificate expired
      - alert: SSLCertExpired
        expr: ssl_cert_not_after - time() < 0
        labels:
          severity: critical
        annotations:
          summary: "SSL certificate EXPIRED for {{ $labels.instance }}"
          description: "Certificate has expired!"

      # Probe failed
      - alert: SSLProbeFailed
        expr: ssl_probe_success == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "SSL probe failed for {{ $labels.instance }}"
          description: "Cannot verify SSL certificate"
Alertmanager Configuration
# alertmanager.yml
global:
  resolve_timeout: 5m

route:
  group_by: ['alertname', 'cluster']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 12h
  receiver: 'team-ssl'

receivers:
  - name: 'team-ssl'
    email_configs:
      - to: 'ssl-team@example.com'
        from: 'alertmanager@example.com'
        smarthost: 'smtp.example.com:587'
    slack_configs:
      - api_url: 'https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK'
        channel: '#ssl-alerts'
        title: 'SSL Certificate Alert'
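Alertmanager ships with amtool, which can lint this file and, in recent versions, dry-run the routing tree. That makes it easy to verify where a given label set lands before an alert actually fires:
# Lint the configuration
amtool check-config /etc/alertmanager/alertmanager.yml
# See which receiver a label set would hit
amtool config routes test --config.file=/etc/alertmanager/alertmanager.yml severity=critical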
Grafana Dashboards
Import Dashboard
- Grafana → Dashboards → Import
- ID: 14662 (SSL/TLS Certificate Dashboard)
- Select Prometheus datasource
- Import
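If you manage Grafana as code, the same dashboard can be provisioned from disk instead of clicked through the UI. A minimal provider sketch (paths here are assumptions; adjust them to your install, and drop the exported dashboard JSON into the referenced directory):
# /etc/grafana/provisioning/dashboards/ssl.yml (path is an assumption)
apiVersion: 1
providers:
  - name: 'ssl-dashboards'
    folder: 'SSL'
    type: file
    options:
      path: /var/lib/grafana/dashboards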
Custom Panels
Panel 1: Certificate Expiry Table
sort_desc((ssl_cert_not_after - time()) / 86400)
Panel 2: Certificate Age
(time() - ssl_cert_not_before) / 86400
Panel 3: TLS Version Distribution
count by (version) (ssl_tls_version_info)
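A useful fourth panel is a single stat showing the soonest expiry across every monitored endpoint, the one number worth glancing at daily:
Panel 4: Soonest Expiry (Stat)
min((ssl_cert_not_after - time()) / 86400)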
Docker Compose Stack
version: '3.8'

services:
  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - ./rules.yml:/etc/prometheus/rules.yml
      - prometheus-data:/prometheus
    ports:
      - "9090:9090"

  ssl_exporter:
    image: ribbybibby/ssl-exporter:latest
    ports:
      - "9219:9219"

  grafana:
    image: grafana/grafana:latest
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin
    volumes:
      - grafana-data:/var/lib/grafana
    ports:
      - "3000:3000"
    depends_on:
      - prometheus

  alertmanager:
    image: prom/alertmanager:latest
    volumes:
      - ./alertmanager.yml:/etc/alertmanager/alertmanager.yml
    ports:
      - "9093:9093"

volumes:
  prometheus-data:
  grafana-data:
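One wrinkle with the compose stack: inside the compose network, containers reach each other by service name, not localhost. A minimal prometheus.yml for this stack might look like the following, pointing the relabeling at ssl_exporter:9219 and registering Alertmanager (service names match the compose file above):
# prometheus.yml for the compose stack
rule_files:
  - /etc/prometheus/rules.yml

alerting:
  alertmanagers:
    - static_configs:
        - targets: ['alertmanager:9093']

scrape_configs:
  - job_name: 'ssl'
    metrics_path: /probe
    static_configs:
      - targets: ['example.com:443']
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: ssl_exporter:9219  # compose service name, not localhost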
Best Practices
Wondering what this looks like at scale? Our case study on certificate automation shows how a team managing 200+ certificates across three cloud providers built exactly this kind of monitoring stack — and what happened before they did.
1. Scrape Frequency
# Not too often - certificates don't change every minute
scrape_interval: 5m # 5 minutes is enough
scrape_timeout: 10s
2. Alert Grouping
# Group alerts by domain
route:
  group_by: ['instance', 'alertname']
  group_wait: 30s
  group_interval: 5m
3. Retention
# Prometheus
--storage.tsdb.retention.time=90d # 3 months of history
Integrations
Slack
slack_configs:
  - api_url: 'https://hooks.slack.com/services/YOUR/WEBHOOK'
    channel: '#ssl-alerts'
    title: '{{ .GroupLabels.alertname }}'
    text: '{{ range .Alerts }}{{ .Annotations.description }}{{ end }}'
PagerDuty
pagerduty_configs:
  - service_key: 'YOUR_KEY'
    description: '{{ .GroupLabels.alertname }}'
Troubleshooting
Exporter not working
# Check logs
journalctl -u ssl_exporter -f
# Test a probe manually (quote the URL so the shell doesn't touch the ?)
curl 'http://localhost:9219/probe?target=example.com:443'
No metrics in Prometheus
# Check targets
curl http://localhost:9090/api/v1/targets
# Query metrics
curl 'http://localhost:9090/api/v1/query?query=ssl_cert_not_after'
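When a target is missing, the full targets payload is noisy; a jq filter (assuming jq is installed) narrows it to health and last error per target:
curl -s http://localhost:9090/api/v1/targets \
  | jq '.data.activeTargets[] | {job: .labels.job, instance: .labels.instance, health, lastError}'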
Integrating SSL/TLS monitoring with Prometheus and Grafana gives you visibility (all certificates in one place), proactivity (alerts before anything expires), history (track changes and renewals over time), and easy CI/CD integration. It’s the foundation every mature infrastructure needs.
One more thing to keep in mind: certificate validity periods are getting shorter. The industry is moving toward 47-day certificates over the next few years — which means monitoring and automation become even more critical, not less. Get your stack set up now, while you have the breathing room.
Combine this with tools like CrtMgr for additional external monitoring and you have a complete observability stack for certificates. Internal metrics tell you what your infrastructure sees; external monitoring tells you what your users see.
You can’t improve what you don’t measure — and you can’t sleep well without monitoring your certificates.