
Rules, Signals, and Alert Hygiene

From noisy logs to actionable alerts with low false positives.

Written by Catalin Fetean
Updated over 2 weeks ago

Audience: SecOps, SRE
Outcomes: Detections that catch real issues, not every blip

High-value detections

  • Credential stuffing: many failed logins in a short window from a previously unseen ASN → alert & temporary CAPTCHA.

  • Privilege change anomalies: role escalations outside maintenance window → page.

  • Webhook signature-failure spike: potential key leak → rotate & quarantine.

  • Export spikes: potential exfiltration → freeze exports & review.

Example KQL/SPL (pseudocode)

where action == "auth.failed" | stats count() by ip, asn, 5m | where count > 20 and asn is new

where action == "rbac.change" | where hour not in maint_window

where action == "webhook.sig_fail" | stats count() by provider, 10m | where count > threshold
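The first query can be sketched in Python as a sliding-window counter. This is only an illustration, not a production rule: the class name `FailedLoginDetector`, the `known_asns` set, and the in-memory storage are assumptions, while the 5-minute window and the threshold of 20 come from the pseudocode.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=5)  # "5m" in the pseudocode
THRESHOLD = 20                 # "count > 20" in the pseudocode


class FailedLoginDetector:
    """Hypothetical in-memory version of the auth.failed rule."""

    def __init__(self, known_asns):
        # Assumption: "asn is new" means "not in a set of previously seen ASNs".
        self.known_asns = set(known_asns)
        self.failures = defaultdict(deque)  # (ip, asn) -> event timestamps

    def observe(self, ts, ip, asn):
        """Record one auth.failed event; return True if the rule fires."""
        window = self.failures[(ip, asn)]
        window.append(ts)
        # Drop events that have fallen out of the 5-minute window.
        while window and ts - window[0] > WINDOW:
            window.popleft()
        return len(window) > THRESHOLD and asn not in self.known_asns


# Usage: 21 failures in 21 seconds from an unseen ASN trips the rule.
detector = FailedLoginDetector(known_asns={64500})
start = datetime(2024, 1, 1, 12, 0, 0)
fired = [
    detector.observe(start + timedelta(seconds=i), "203.0.113.7", 64501)
    for i in range(21)
]
```

Keying the counter on `(ip, asn)` mirrors the `by ip, asn` grouping in the query. A real deployment would likely run this in the SIEM itself instead of application code.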

Alert hygiene

  • Page only when the issue is user-visible or money-impacting.

  • Everything else → ticket or Slack with cooldowns.
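The two routing rules above might look like this in code. It is a minimal sketch under stated assumptions: the `Alert` fields, the `Router` class, the return labels, and the 30-minute cooldown are all hypothetical, since the article does not specify them.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

COOLDOWN = timedelta(minutes=30)  # assumed value; tune per rule


@dataclass
class Alert:
    rule: str
    user_visible: bool = False
    money_impacting: bool = False


class Router:
    """Hypothetical router: page on impact, otherwise notify with a cooldown."""

    def __init__(self):
        self._last_sent = {}  # rule name -> time of last non-page notification

    def route(self, alert, now):
        if alert.user_visible or alert.money_impacting:
            return "page"
        last = self._last_sent.get(alert.rule)
        if last is not None and now - last < COOLDOWN:
            return "suppressed"  # still inside the cooldown window
        self._last_sent[alert.rule] = now
        return "notify"  # ticket or Slack


# Usage: a repeat of the same low-impact rule within 30 minutes is suppressed.
router = Router()
now = datetime(2024, 1, 1, 9, 0, 0)
first = router.route(Alert("webhook.sig_fail"), now)
repeat = router.route(Alert("webhook.sig_fail"), now + timedelta(minutes=5))
```

Pages bypass the cooldown here on purpose, so a user-visible incident is never silenced by an earlier notification.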

QA checklist

  • Simulate each rule; ensure one alert, not a storm.

  • Runbook link included in every alert.
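Both checklist items can be automated in a rule test. The sketch below assumes each firing is a dict with hypothetical `rule`, `entity`, and `runbook_url` keys; the helper names and the example URL are illustrative only.

```python
def collapse(firings):
    """Collapse repeated firings of the same rule and entity into one alert."""
    alerts = {}
    for firing in firings:
        key = (firing["rule"], firing["entity"])
        if key not in alerts:
            alerts[key] = {**firing, "count": 0}
        alerts[key]["count"] += 1
    return list(alerts.values())


def check_alerts(alerts):
    """Return a list of QA problems; an empty list means the checks pass."""
    problems = []
    if len(alerts) != 1:
        problems.append(f"expected 1 alert, got {len(alerts)}")
    for alert in alerts:
        if not alert.get("runbook_url"):
            problems.append(f"{alert['rule']}: missing runbook link")
    return problems


# Usage: simulate a burst of 50 identical firings for one rule.
burst = [
    {
        "rule": "webhook.sig_fail",
        "entity": "provider-a",
        "runbook_url": "https://example.com/runbooks/webhook-sig-fail",
    }
] * 50
alerts = collapse(burst)
problems = check_alerts(alerts)
```

A burst that produces more than one alert, or an alert without a runbook link, shows up as an entry in `problems`.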

Runbook: webhook secret leak

  1. Rotate secret;

  2. Temporarily suppress invalid-signature alerts caused by the old secret;

  3. Re-verify recent events;

  4. Post-mortem with timeline.
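Steps 1 to 3 could be supported by a small classifier that tells apart events signed with the new secret, events still signed with the old one, and genuinely invalid signatures. This is a sketch under assumptions: it presumes hex-encoded HMAC-SHA256 signatures, which the article does not state, and `sign` and `classify` are hypothetical helpers.

```python
import hashlib
import hmac


def sign(secret: bytes, payload: bytes) -> str:
    """Assumed scheme: hex-encoded HMAC-SHA256 over the raw payload."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def classify(payload: bytes, signature: str, new_secret: bytes, old_secret: bytes) -> str:
    """Classify one webhook event during a secret rotation."""
    # compare_digest avoids leaking information through timing differences.
    if hmac.compare_digest(sign(new_secret, payload), signature):
        return "valid"
    if hmac.compare_digest(sign(old_secret, payload), signature):
        # Still rejected, but the alert can be suppressed temporarily (step 2).
        return "old_secret"
    return "invalid"  # unknown signer: keep alerting


# Usage: re-verify a recent event (step 3) after rotating the secret.
old, new = b"old-secret", b"new-secret"
event = b'{"id": 1}'
status_new = classify(event, sign(new, event), new, old)
status_old = classify(event, sign(old, event), new, old)
```

Events classified as `old_secret` are not accepted in this sketch, since the old secret is presumed leaked; only their alerts are muted while senders pick up the new secret.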
