Checklists, Playbooks, Common Fixes

The lists you run before prod and every quarter thereafter.

Written by Catalin Fetean
Updated over 2 weeks ago

Audience: Owners, Admins, SRE, Support
Outcomes: Repeatable diligence; smooth incident handling

Incident playbooks (quick map)

  • Credential leak: force logout of active sessions; rotate the leaked credentials; notify those affected; write a post-mortem

  • Webhook flood: throttle workers; confirm deduplication is in place; drain the queue

  • Duplicate payout: check release reference constraint; reconcile; issue refund/adjustment

  • SSE outage: disable proxy buffering; verify CORS/credentials; restart stream workers
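
The duplicate payout playbook relies on a constraint on the release reference. As a minimal sketch of that idea, assuming the constraint is a database uniqueness constraint, the example below uses an in-memory SQLite table. The table name `payouts`, the column names, and the helper functions are hypothetical and only illustrate the pattern; the real schema may differ.

```python
import sqlite3


def open_ledger() -> sqlite3.Connection:
    """Create an in-memory ledger with a UNIQUE release reference (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE payouts (
            id INTEGER PRIMARY KEY,
            release_reference TEXT NOT NULL UNIQUE,
            amount_cents INTEGER NOT NULL
        )
        """
    )
    return conn


def record_payout(conn: sqlite3.Connection, release_reference: str, amount_cents: int) -> bool:
    """Insert a payout; return False if this release reference was already recorded."""
    try:
        # The connection context manager commits on success and rolls back on error.
        with conn:
            conn.execute(
                "INSERT INTO payouts (release_reference, amount_cents) VALUES (?, ?)",
                (release_reference, amount_cents),
            )
        return True
    except sqlite3.IntegrityError:
        # The UNIQUE constraint rejected a second payout for the same release.
        return False
```

With a constraint like this, a retried or duplicated release request fails at the database instead of paying out twice, and the rejected attempt becomes a signal to reconcile rather than a second transfer.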

Common fixes

  • CORS error → add the exact origin to the allowed origins (never use * together with credentials)

  • “Invalid signature” → verify against the raw request body with the correct secret; check clock skew

  • “Pending forever” → replay the webhook; check the handler logs; reprocess the dead-letter queue (DLQ)

  • Excess 401/403 → check for session expiry or a KYC gate; review the policy logs
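
The “Invalid signature” fix above names three causes: a re-serialized body, a wrong secret, and clock skew. The sketch below shows how each one produces a failed check. It assumes an HMAC-SHA256 signature over `timestamp.raw_body` and a 300-second tolerance; the signing scheme, the tolerance, and the function names are illustrative assumptions, not the platform's documented format.

```python
import hashlib
import hmac
import time
from typing import Optional

MAX_SKEW_SECONDS = 300  # assumed tolerance; use the value your provider documents


def sign(secret: bytes, timestamp: int, raw_body: bytes) -> str:
    """Compute a hex HMAC-SHA256 over 'timestamp.raw_body' (assumed scheme)."""
    message = str(timestamp).encode("ascii") + b"." + raw_body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(
    secret: bytes,
    timestamp: int,
    raw_body: bytes,
    signature: str,
    now: Optional[float] = None,
) -> bool:
    """Return True only if the timestamp is fresh and the signature matches."""
    current = time.time() if now is None else now
    # Clock skew: reject timestamps too far from the local clock.
    if abs(current - timestamp) > MAX_SKEW_SECONDS:
        return False
    expected = sign(secret, timestamp, raw_body)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
```

The body must be the exact bytes received on the wire. Parsing the JSON and serializing it again usually changes whitespace or key order, which changes the digest even though the payload is logically identical.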

QA checklist

  • Run dry-run incident drills at least once per quarter and record the timings.

  • Every P1 produces a written post-mortem with actions & owners.
