Skill Workflow

CI Debug

Debug a failing CI pipeline

Install path: ~/.claude/skills/ci-debug/SKILL.md
Command: /ci-debug
Tags: ci, continuous-integration, pipeline, debug, github-actions
SKILL.md

CI Debug Skill

You are a CI/CD debugging expert. When this skill is invoked, investigate and fix a failing CI pipeline.

What This Skill Does

Identifies why a CI pipeline is failing, diagnoses the root cause, and applies the fix.

Step-by-Step Instructions

  1. Get the failure details. Check:

    • gh run list --limit 5 to see recent workflow runs
    • gh run view <run-id> --log-failed to see the failing step’s logs
    • If the user provides a link or run ID, use that directly
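
The run listing in this step can also be scripted. The following is a minimal Python sketch, assuming the GitHub CLI is installed and authenticated. The helper names `fetch_recent_runs` and `failed_runs` and the selection of `--json` fields are illustrative choices, not something this skill prescribes.

```python
import json
import subprocess

# Fields requested from `gh run list --json`; this selection is an assumption
# made for illustration.
RUN_FIELDS = "databaseId,workflowName,headBranch,status,conclusion"


def fetch_recent_runs(limit: int = 5) -> list:
    """Return recent workflow runs as dicts (needs an authenticated gh CLI)."""
    result = subprocess.run(
        ["gh", "run", "list", "--limit", str(limit), "--json", RUN_FIELDS],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


def failed_runs(runs: list) -> list:
    """Keep only completed runs whose conclusion is 'failure'."""
    return [
        run
        for run in runs
        if run.get("status") == "completed" and run.get("conclusion") == "failure"
    ]
```

Each `databaseId` returned by `failed_runs(fetch_recent_runs())` is a run ID that can be passed to `gh run view <run-id> --log-failed`.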
  2. Identify the failing step. CI pipelines have multiple stages. Pinpoint which step failed:

    • Checkout / setup
    • Dependency installation
    • Linting
    • Type checking
    • Unit tests
    • Integration tests
    • Build
    • Deploy
  3. Read the error output. Look for:

    • The actual error message (often buried in verbose output)
    • Stack traces
    • Exit codes
    • Timeout indicators
    • Out-of-memory signals
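
To surface these signals in a long log, a simple line scan can help. This is a rough sketch: the regular expressions are heuristic assumptions, real CI logs vary widely, and the function name `find_signals` is hypothetical.

```python
import re

# Heuristic patterns (assumptions); tune them for the tools in your pipeline.
SIGNAL_PATTERNS = {
    "error": re.compile(r"\berror\b", re.IGNORECASE),
    "stack_trace": re.compile(r"^\s+at \S+|Traceback \(most recent call last\)"),
    "exit_code": re.compile(r"exit(?:ed with)? code (\d+)", re.IGNORECASE),
    "timeout": re.compile(r"timed? ?out", re.IGNORECASE),
    "out_of_memory": re.compile(r"out of memory|heap limit|\boom\b", re.IGNORECASE),
}


def find_signals(log_text: str) -> dict:
    """Map each signal name to (line_number, stripped_line) pairs where it matches."""
    hits = {name: [] for name in SIGNAL_PATTERNS}
    for number, line in enumerate(log_text.splitlines(), start=1):
        for name, pattern in SIGNAL_PATTERNS.items():
            if pattern.search(line):
                hits[name].append((number, line.strip()))
    return hits
```

The line numbers point back into the full log, so the surrounding context can still be read rather than relying on the matched line alone.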
  4. Diagnose the root cause. Common CI failure categories:

    • Dependency issues: Lock file out of sync, private registry auth, version conflicts
    • Environment differences: Node version mismatch, missing system dependencies, OS differences
    • Flaky tests: Tests that pass locally but fail in CI due to timing, network, or ordering
    • Build failures: TypeScript errors, missing imports, incompatible configurations
    • Resource limits: Timeouts, OOM kills, disk space
    • Configuration errors: Wrong secrets, missing env vars, incorrect paths
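
As a first pass, an error message can be matched against keyword hints for these categories. The sketch below is illustrative only: the hint strings are assumptions, not a complete rule set, and `guess_categories` is a hypothetical helper. Flaky tests are left out on purpose, since a single message cannot show that a failure is intermittent.

```python
# Keyword hints per category. These strings are assumptions chosen for
# illustration; extend them for the tools your pipeline actually uses.
CATEGORY_HINTS = {
    "dependency": ("lock file", "lockfile", "eresolve", "could not resolve dependency"),
    "environment": ("unsupported engine", "command not found"),
    "build": ("cannot find module", "error ts", "syntaxerror"),
    "resource": ("out of memory", "timed out", "no space left on device"),
    "configuration": ("secret", "is not set", "permission denied"),
}


def guess_categories(error_message: str) -> list:
    """Return every category whose hints appear in the message (case-insensitive)."""
    lowered = error_message.lower()
    return [
        category
        for category, hints in CATEGORY_HINTS.items()
        if any(hint in lowered for hint in hints)
    ]
```

A match is only a starting hypothesis; the diagnosis still needs to be confirmed against the full log for the failing step.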
  5. Reproduce locally if possible. Try to reproduce the failure on the local machine:

    • Use the same Node/Python/Go version as CI
    • Run the exact same commands that CI runs
    • Check whether setting the CI=true environment variable changes behavior
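
One way to test the effect of `CI=true` locally is to rerun the failing command with that variable set. A minimal sketch, assuming the command can be run from the current directory; `run_like_ci` is a hypothetical helper name.

```python
import os
import subprocess
import sys
from typing import Optional


def run_like_ci(command: list, extra_env: Optional[dict] = None) -> subprocess.CompletedProcess:
    """Run a command with CI=true set, keeping the rest of the environment."""
    env = {**os.environ, "CI": "true", **(extra_env or {})}
    return subprocess.run(command, env=env, capture_output=True, text=True)


if __name__ == "__main__":
    # Demonstration: a child Python process reads the CI variable back.
    result = run_like_ci([sys.executable, "-c", "import os; print(os.environ['CI'])"])
    print(result.returncode, result.stdout.strip())
```

Comparing the output of the same command with and without the variable shows whether the tool changes behavior under CI, for example by disabling watch mode or interactive prompts.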
  6. Apply the fix. Fix the root cause in the codebase:

    • Update configuration files
    • Fix the failing test or code
    • Update dependency lock files
    • Adjust CI workflow file if needed
  7. Verify the fix. Run the failing command locally to confirm it passes.

  8. Report findings:

## CI Debug Report

### Failure
- Pipeline: name
- Step: failing step
- Error: core error message

### Root Cause
Explanation of why it failed.

### Fix Applied
What was changed to fix it.

### Prevention
How to prevent this from happening again.
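
If the report is assembled programmatically, the template above can be filled from structured fields. This is a sketch only: the `CIDebugReport` class and its field names are hypothetical and simply mirror the template's headings.

```python
from dataclasses import dataclass


@dataclass
class CIDebugReport:
    """Hypothetical container mirroring the sections of the report template."""

    pipeline: str
    step: str
    error: str
    root_cause: str
    fix_applied: str
    prevention: str

    def to_markdown(self) -> str:
        """Render the fields using the same headings as the template."""
        lines = [
            "## CI Debug Report",
            "",
            "### Failure",
            f"- Pipeline: {self.pipeline}",
            f"- Step: {self.step}",
            f"- Error: {self.error}",
            "",
            "### Root Cause",
            self.root_cause,
            "",
            "### Fix Applied",
            self.fix_applied,
            "",
            "### Prevention",
            self.prevention,
        ]
        return "\n".join(lines)
```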

Guidelines

  • Always read the full CI log for the failing step, not just the last line.
  • If a test is flaky (passes sometimes, fails sometimes), the fix is to make the test deterministic, not to add retries.
  • If CI fails but the same commands pass locally, focus on environment differences first.
  • Do not add continue-on-error: true to mask failures.
  • Do not skip or disable failing tests as a fix.
  • If the CI config itself needs changes, explain what each change does.
  • Check if the failure is in a recently changed file or if it is a pre-existing issue.
  • If secrets or environment variables are missing, identify which ones and explain where to set them without exposing their values.
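
The last guideline can be supported by a check that reports only the names of missing variables and never reads out their values. A minimal sketch; `missing_env_vars` and the variable names in the usage note are hypothetical.

```python
import os
from typing import Iterable, Mapping, Optional


def missing_env_vars(
    required: Iterable[str], environ: Optional[Mapping[str, str]] = None
) -> list:
    """Return the names of required variables that are unset or empty.

    Only names are returned, so secret values never reach logs or reports.
    """
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]
```

For example, `missing_env_vars(["NPM_TOKEN", "DEPLOY_URL"])` returns the subset of those names that still needs to be configured, which can then be listed in the report alongside where to set them.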

Copy this into ~/.claude/skills/ci-debug/SKILL.md to use it as a slash command in Claude Code.
