Documenting Failure Scenarios
Document what happens when things go wrong. Failure scenarios help teams prepare for incidents and build resilient systems.
Every system fails eventually. Networks partition. Databases go down. Third-party services time out. The difference between chaos and a controlled response is preparation. Failure scenarios document what happens when components fail and how the system should respond.
Adding a Failure Scenario Block
Failure Scenario Fields
Impact Examples
| Scenario | Impact Description |
|---|---|
| Write failure | All write operations fail. Users cannot save changes. |
| Checkout blocked | The checkout flow is blocked. Revenue is lost for the duration of the outage. |
| Read degradation | Read operations degrade to cached data. Users see stale information. |
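The read degradation row describes a behaviour that is easy to state and easy to get subtly wrong, so a small sketch may help. It is illustrative only: `DegradingReader`, `ReadResult`, and the `primary_read` callable are hypothetical names, not part of NeoArc, and the sketch assumes the primary store signals an outage by raising `ConnectionError`.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class ReadResult:
    value: Optional[str]  # None when the primary is down and nothing is cached
    stale: bool           # True when the value did not come from the primary store


class DegradingReader:
    """Serve reads from the primary store, falling back to cached data on failure."""

    def __init__(self, primary_read: Callable[[str], str]) -> None:
        self._primary_read = primary_read  # hypothetical primary-store lookup
        self._cache: Dict[str, str] = {}

    def read(self, key: str) -> ReadResult:
        try:
            value = self._primary_read(key)
        except ConnectionError:
            # Primary is unavailable: degrade to the last known value, if any.
            return ReadResult(self._cache.get(key), stale=True)
        # Primary answered: refresh the cache so later fallbacks have data.
        self._cache[key] = value
        return ReadResult(value, stale=False)
```

Returning an explicit `stale` flag lets the caller tell users they are seeing cached information, which matches the impact described in the table.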
Severity Levels
This example uses the NeoArc Failure Scenario content block.
Scenario Status
Track the validation state of each scenario:
Visualising Failure Dependencies
Graph diagrams help visualise how failures cascade through a system. This example shows component dependencies with failure severity indicated by colour: red for critical, orange for high, green for medium, and blue for low-impact components. External dependencies are shown in indigo.
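The cascade idea can also be sketched in code. The following is a minimal illustration, assuming a hypothetical dependency map (`DEPENDS_ON`) and a hypothetical helper (`affected_by`); neither comes from NeoArc. It walks the graph breadth-first to find every component that directly or transitively depends on a failed one.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical dependency map: each component lists what it depends on.
DEPENDS_ON: Dict[str, List[str]] = {
    "web": ["api"],
    "api": ["database", "payments"],
    "reports": ["database"],
    "database": [],
    "payments": [],  # stands in for an external dependency
}


def affected_by(failed: str, depends_on: Dict[str, List[str]]) -> Set[str]:
    """Return every component that directly or transitively depends on `failed`."""
    # Invert the map so we can walk from a dependency to its dependents.
    dependents: Dict[str, List[str]] = {name: [] for name in depends_on}
    for component, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(component)

    affected: Set[str] = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, []):
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected
```

With this sample map, `affected_by("database", DEPENDS_ON)` returns `api`, `web`, and `reports`, while a failure in `web` affects nothing else. A component with a large affected set is a natural candidate for a higher severity rating.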
Categories of Failure Scenarios
Consider documenting scenarios in these categories:
Using Failure Scenarios
Failure scenarios are living documents. They should be:
| Activity | Description |
|---|---|
| Reviewed during design | Think through failures before building |
| Updated after incidents | Real incidents reveal gaps in documentation |
| Referenced during on-call | Engineers should know where to find them |
| Tested periodically | Chaos engineering validates that documented behaviour matches reality |
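One way to keep the "tested periodically" habit honest is to flag scenarios whose last test is older than an agreed interval. The sketch below assumes a hypothetical `ScenarioRecord` with a `last_tested` date and a 90-day default; both are illustrative choices, not NeoArc fields or recommendations.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class ScenarioRecord:
    name: str
    last_tested: date  # hypothetical field: date of the last chaos test or drill


def overdue(
    records: List[ScenarioRecord],
    today: date,
    max_age: timedelta = timedelta(days=90),  # assumed interval, adjust to taste
) -> List[str]:
    """Return the names of scenarios last tested more than `max_age` ago."""
    return [r.name for r in records if today - r.last_tested > max_age]
```

Passing `today` explicitly, rather than calling `date.today()` inside the function, keeps the check deterministic and easy to test.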
The failure scenarios you see in this documentation site were created using the same blocks you will use to document your own system resilience.