# Runbook: Alertmanager globally silenced
## Symptoms
- No alerts in `#ops-alerts` for several hours despite normal traffic.
- Prometheus shows series for `ALERTS{alertstate="firing"}`, but Alertmanager is not routing notifications for them.
## Check
**check.sh**

```sh
# Active silences (amtool is inside the container)
docker exec via_prod-alertmanager amtool silence --alertmanager.url=http://localhost:9093 query

# Anything firing but suppressed?
curl -s 'http://localhost:9093/api/v2/alerts?silenced=true' | jq '.[].labels'

# Sanity: is Prometheus seeing rules at all?
curl -s http://localhost:9090/api/v1/rules | jq '.data.groups[].rules[] | select(.state=="firing")'
```
## Remediation
- List every silence and expire any whose window has passed but status is still `active`:

**expire.sh**

```sh
# Expire a specific silence by ID
docker exec via_prod-alertmanager amtool silence --alertmanager.url=http://localhost:9093 expire <silence-id>

# Or nuke them all (use sparingly)
for id in $(docker exec via_prod-alertmanager amtool silence --alertmanager.url=http://localhost:9093 query -q); do
  docker exec via_prod-alertmanager amtool silence --alertmanager.url=http://localhost:9093 expire "$id"
done
```
- If notifications still do not arrive, re-verify each hop of the delivery chain: Slack webhook, PagerDuty integration key, SMTP relay.
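The Slack hop can be exercised directly, bypassing Alertmanager entirely. A minimal sketch, assuming the webhook URL is exported as `SLACK_WEBHOOK_URL` (a hypothetical variable name; copy the actual URL from the Slack receiver in your Alertmanager config):

```sh
# Hypothetical: export SLACK_WEBHOOK_URL from your Alertmanager receiver config first
SLACK_WEBHOOK_URL="${SLACK_WEBHOOK_URL:-}"

# Minimal message the Slack incoming-webhook API accepts
PAYLOAD='{"text":"alertmanager runbook delivery check"}'

if [ -n "$SLACK_WEBHOOK_URL" ]; then
  # Slack replies with a plain "ok" body on success
  curl -s -X POST -H 'Content-Type: application/json' -d "$PAYLOAD" "$SLACK_WEBHOOK_URL"
else
  echo "SLACK_WEBHOOK_URL not set; skipping send"
fi
```

If this posts to the channel but Alertmanager notifications still do not, the problem is inside Alertmanager (routing tree or silences), not the webhook.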
- Fire a synthetic test alert:

**test.sh**

```sh
curl -s -X POST http://localhost:9093/api/v2/alerts \
  -H 'Content-Type: application/json' \
  -d '[{"labels":{"alertname":"Heartbeat","severity":"warn"},"annotations":{"summary":"runbook ping"}}]'
```
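Once posted, the heartbeat should appear in the active-alerts list. A self-contained sketch of that check, run here against a canned sample of the `/api/v2/alerts` response so it works offline (the commented `curl` is what you would pipe into the same `jq` filter on the live instance):

```sh
# Canned sample of GET /api/v2/alerts output, trimmed to the fields we filter on
SAMPLE='[{"labels":{"alertname":"Heartbeat","severity":"warn"},"status":{"state":"active"}}]'

# Against the live instance:
#   curl -s http://localhost:9093/api/v2/alerts | jq -r '.[] | select(.labels.alertname=="Heartbeat") | .status.state'
STATE=$(echo "$SAMPLE" | jq -r '.[] | select(.labels.alertname=="Heartbeat") | .status.state')
echo "$STATE"
```

A state of `suppressed` rather than `active` means a silence or inhibition is still swallowing it.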
## Post-incident
- Agree on a silence policy: every silence must reference a ticket and expire within 24 h.
- Add a `DeadMansSwitch` rule so a missing heartbeat is itself an alert.
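The `DeadMansSwitch` bullet can be sketched as a Prometheus rule file: `vector(1)` always evaluates to 1, so the alert fires constantly and acts as a heartbeat. The file path and group name below are assumptions; adjust to your rules layout, and route this alert to an external dead-man's-switch receiver (not Slack) so a stopped heartbeat is noticed:

```sh
# Hypothetical rules file path - point this at wherever your Prometheus rules live
RULES_FILE="${RULES_FILE:-/tmp/deadmansswitch.yml}"

cat > "$RULES_FILE" <<'EOF'
groups:
  - name: meta
    rules:
      # Always-firing alert; route it to an external dead-man's-switch
      # receiver. If its ping ever stops, the alerting pipeline is broken.
      - alert: DeadMansSwitch
        expr: vector(1)
        labels:
          severity: none
        annotations:
          summary: Alerting pipeline heartbeat (should always be firing)
EOF

echo "wrote $RULES_FILE"
```

Remember to reload Prometheus (`SIGHUP` or `POST /-/reload`) after adding the file.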