The ResOps Readiness Assessment: 10 Questions to Gauge Your Survivability

Scoring: Count one point for every question you can answer with a "Yes".

  • 0–4 "Yes": Crisis Impending. Your "Uptime" is a mask for deep systemic fragility.

  • 5–7 "Yes": Fragile Foundation. You have the tools, but lack the strategic orchestration.

  • 8–10 "Yes": Resilience Leader. You are ready for the Black Swan.
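
If you track answers in a spreadsheet or script, the scoring key above can be expressed as a small helper. This is a minimal sketch; the function name `readiness_band` and the sample answers are illustrative assumptions, not part of the assessment itself.

```python
def readiness_band(yes_count: int) -> str:
    """Map the number of "Yes" answers (0-10) to the band in the scoring key."""
    if not 0 <= yes_count <= 10:
        raise ValueError("yes_count must be between 0 and 10")
    if yes_count <= 4:
        return "Crisis Impending"
    if yes_count <= 7:
        return "Fragile Foundation"
    return "Resilience Leader"


# Hypothetical answers to questions 1-10 (True means "Yes").
answers = [True, False, True, True, False, True, False, True, True, False]
print(readiness_band(sum(answers)))
```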

Phase 1: Visibility & Mapping

  • 1. The N-th Party Map: Do we have a live, automated map not just of our direct vendors, but also of the infrastructure they rely on (e.g., knowing which of our SaaS tools share the same AWS region)?

  • 2. Shadow Discovery: Can we identify every external API call made by our critical path services within 60 seconds of a failure?
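
To make question 1 concrete, here is one rough way to spot vendors that share the same underlying region once you have a dependency inventory. The vendor names, the inventory structure, and the function `shared_infrastructure` are all hypothetical; a real map would be fed by automated discovery rather than a hand-written dictionary.

```python
from collections import defaultdict

# Hypothetical inventory: vendor -> set of (provider, region) it depends on.
VENDOR_DEPENDENCIES = {
    "billing-saas": {("aws", "us-east-1")},
    "auth-saas": {("aws", "us-east-1"), ("aws", "eu-west-1")},
    "email-saas": {("gcp", "us-central1")},
}


def shared_infrastructure(deps):
    """Return every (provider, region) that two or more vendors rely on."""
    by_location = defaultdict(set)
    for vendor, locations in deps.items():
        for location in locations:
            by_location[location].add(vendor)
    return {
        location: sorted(vendors)
        for location, vendors in by_location.items()
        if len(vendors) > 1
    }


for location, vendors in shared_infrastructure(VENDOR_DEPENDENCIES).items():
    print(location, "is a shared dependency of", vendors)
```

Any location that shows up here is a single regional failure that could take out several "independent" tools at once.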

Phase 2: Tactical Execution (The Playbook)

  • 3. MTRL vs. MTTR: Do we measure Mean Time to Recovery of Logic (how quickly the business function is working again) separately from Mean Time to Repair (how quickly the underlying tech is "fixed")?

  • 4. Unannounced Chaos: Have we successfully executed an unannounced "Chaos Experiment" in production within the last 90 days without causing an unintended customer outage?

  • 5. Multi-Vector Scenarios: Does our Disaster Recovery plan include a scenario where a technical failure coincides with a human or environmental crisis (e.g., a cyberattack during a holiday, or an outage during a regional power failure)?
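
Question 3 only works if incident records carry two separate end timestamps. The sketch below assumes a hypothetical record shape with `business_restored` and `tech_fixed` fields and made-up incident data; it simply shows that the two averages can diverge widely.

```python
from datetime import datetime
from statistics import mean

# Hypothetical incident log; field names and timestamps are illustrative only.
INCIDENTS = [
    {
        "start": datetime(2024, 1, 10, 9, 0),
        "business_restored": datetime(2024, 1, 10, 9, 20),  # workaround in place
        "tech_fixed": datetime(2024, 1, 10, 11, 0),         # root cause repaired
    },
    {
        "start": datetime(2024, 2, 3, 14, 0),
        "business_restored": datetime(2024, 2, 3, 14, 40),
        "tech_fixed": datetime(2024, 2, 3, 15, 0),
    },
]


def mean_minutes(incidents, end_key):
    """Average minutes from incident start to the timestamp stored under end_key."""
    return mean(
        (incident[end_key] - incident["start"]).total_seconds() / 60
        for incident in incidents
    )


mtrl = mean_minutes(INCIDENTS, "business_restored")
mttr = mean_minutes(INCIDENTS, "tech_fixed")
print(f"MTRL: {mtrl:.0f} min, MTTR: {mttr:.0f} min")
```

If your tooling records only one "resolved" timestamp, you are measuring MTTR and guessing at MTRL.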

Phase 3: Culture & Governance

  • 6. The Veto Power: Does our ResOps lead (or equivalent) have the formal authority to "veto" a high-speed feature release if it exceeds our established Resilience Debt threshold?

  • 7. Neutral Reporting: Does the person responsible for resilience report to a Risk or Operations executive (CRO/COO) rather than to one incentivized primarily by shipping speed (CTO/VP Eng)?

  • 8. Impact Tolerance: Have we defined the "Maximum Tolerable Period of Disruption" for our top three revenue-generating services in terms of dollars lost, not just "9s"?
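
For question 8, translating "9s" into dollars is simple arithmetic once you have a revenue rate. The figures below ($2,000 per minute, a $100,000 loss limit) are invented for illustration, and the sketch assumes revenue is lost linearly during an outage, which is a simplification.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; ignores leap years


def allowed_downtime_minutes(availability: float) -> float:
    """Minutes of downtime per year permitted by a target such as 0.999."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def disruption_cost(minutes: float, revenue_per_minute: float) -> float:
    """Revenue at risk for an outage of the given length (linear-loss assumption)."""
    return minutes * revenue_per_minute


def tolerable_minutes(max_loss: float, revenue_per_minute: float) -> float:
    """Longest outage, in minutes, before losses exceed max_loss."""
    return max_loss / revenue_per_minute


# Hypothetical service earning $2,000 per minute with a "three nines" target.
budget = allowed_downtime_minutes(0.999)
print(f"{budget:.1f} min/year allowed, ${disruption_cost(budget, 2000):,.0f} at risk")
print(f"Tolerable outage for a $100,000 limit: {tolerable_minutes(100_000, 2000):.0f} min")
```

The comparison is the point: an availability target that sounds strict can still permit far more loss per year than the business would knowingly accept.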

Phase 4: The Remediation Loop

  • 9. P0 Resilience: Are vulnerabilities found during "Game Days" automatically converted into P0/P1 tickets that take priority over the new feature roadmap?

  • 10. Graceful Degradation: Can our core business transaction (e.g., checkout, data upload, search) survive a total failure of its primary database by switching to a "Read-Only" or "Cached" mode?
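
One common shape for question 10 is a read path that falls back to the last known good value when the primary store is down and flags the response as degraded. The class names `CatalogService` and `PrimaryUnavailable`, and the in-memory dictionary standing in for a cache, are assumptions made for this sketch rather than a recommended design.

```python
class PrimaryUnavailable(Exception):
    """Raised by the (hypothetical) primary store when it cannot serve requests."""


class CatalogService:
    def __init__(self, primary, cache=None):
        self.primary = primary  # callable: key -> value, may raise PrimaryUnavailable
        self.cache = {} if cache is None else cache  # last known good values

    def read(self, key):
        """Serve from the primary; fall back to the cache if the primary is down."""
        try:
            value = self.primary(key)
        except PrimaryUnavailable:
            if key in self.cache:
                # Degraded mode: possibly stale, but the transaction survives.
                return {"value": self.cache[key], "degraded": True}
            raise  # nothing cached, so the failure has to surface
        self.cache[key] = value  # refresh the fallback copy on every good read
        return {"value": value, "degraded": False}


# Simulated primary database that can be switched off.
state = {"up": True}


def primary(key):
    if not state["up"]:
        raise PrimaryUnavailable(key)
    return f"price-of-{key}"


service = CatalogService(primary)
print(service.read("sku-1"))   # served by the primary
state["up"] = False
print(service.read("sku-1"))   # served from the cache, marked degraded
```

The `degraded` flag matters as much as the fallback: callers need it to disable writes or warn users that data may be stale.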
