Workflow Maintenance

Set Up Monitoring

Workflow for adding monitoring and alerting to a service

maintenancemonitoringobservability
CLAUDE.md

When setting up monitoring for a service:

  1. Add a health check endpoint that verifies the service and its critical dependencies are reachable.
  2. Add structured logging with consistent fields: timestamp, level, message, request_id.
  3. Add metrics for: request count, error rate, response time (p50, p95, p99), and active connections.
  4. Set up dashboards showing these metrics over time.
  5. Configure alerts for: error rate exceeding baseline by 2x, p99 latency exceeding SLA, health check failures.
  6. Ensure each alert has a runbook linked in the alert message.
  7. Test the alerting pipeline: trigger a test alert and verify it reaches the on-call channel.

Copy this workflow into your CLAUDE.md or agent config file so your agent follows this process automatically.

get crystl