Set Up Monitoring
Workflow for adding monitoring and alerting to a service
CLAUDE.md
When setting up monitoring for a service:
- Add a health check endpoint that verifies the service and its critical dependencies are reachable.
- Add structured logging with consistent fields: timestamp, level, message, request_id.
- Add metrics for: request count, error rate, response time (p50, p95, p99), and active connections.
- Set up dashboards showing these metrics over time.
- Configure alerts for: error rate above twice its baseline, p99 latency above the SLA threshold, and health check failures.
- Ensure each alert has a runbook linked in the alert message.
- Test the alerting pipeline: trigger a test alert and verify it reaches the on-call channel.
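
The three alert conditions in the workflow can be sketched as a small check, shown here only as an illustration. Everything named in it is an assumption rather than part of the workflow: the `percentile` and `firing_alerts` helpers, the nearest-rank percentile method, the placeholder runbook URL, and reading "2x" as "more than twice the baseline". In practice a monitoring system would evaluate these rules, not application code.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers (hypothetical helper)."""
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def firing_alerts(error_rate, baseline_error_rate, latencies_ms, sla_ms,
                  health_ok, runbook_base="https://runbooks.example.com"):
    """Return one message per firing alert, each ending with a runbook link.

    runbook_base is a placeholder URL used only for illustration.
    """
    alerts = []

    # Condition 1: error rate is more than twice its baseline.
    if error_rate > 2 * baseline_error_rate:
        alerts.append(
            f"error rate {error_rate:.2%} is above 2x baseline "
            f"({baseline_error_rate:.2%}) | runbook: {runbook_base}/error-rate"
        )

    # Condition 2: p99 latency is above the SLA threshold.
    p99 = percentile(latencies_ms, 99)
    if p99 > sla_ms:
        alerts.append(
            f"p99 latency {p99} ms is above SLA ({sla_ms} ms) "
            f"| runbook: {runbook_base}/latency"
        )

    # Condition 3: the health check failed.
    if not health_ok:
        alerts.append(f"health check failed | runbook: {runbook_base}/health-check")

    return alerts


if __name__ == "__main__":
    sample_latencies = list(range(1, 101))  # synthetic data: 1..100 ms
    for message in firing_alerts(0.05, 0.02, sample_latencies, 200, True):
        print(message)
```

Calling `firing_alerts` with synthetic values that breach a threshold is also one way to exercise the last step of the workflow, since the resulting message can be sent through the real notification path to confirm it reaches the on-call channel.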
Copy this workflow into your CLAUDE.md or agent config file so your agent follows this process automatically.
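
For the structured logging step, a minimal sketch using only the Python standard library might look like the following. The `JsonFormatter` class, the `make_logger` helper, and the choice of JSON lines as the output format are assumptions made for illustration. The workflow itself only requires the four fields: timestamp, level, message, and request_id.

```python
import json
import logging
import uuid
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object with the four workflow fields."""

    def format(self, record):
        entry = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "message": record.getMessage(),
            # request_id is supplied per call via `extra`; None if the caller omits it.
            "request_id": getattr(record, "request_id", None),
        }
        return json.dumps(entry)


def make_logger(name="service"):
    """Return a logger that writes JSON lines to stderr (hypothetical helper)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid attaching duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    logger = make_logger()
    # In a real service the request ID would come from the incoming request.
    logger.info("order created", extra={"request_id": str(uuid.uuid4())})
```

Keeping the same field names in every service lets a log search filter on `request_id` to follow one request across components.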