Major Incident Procedure
P1 Declaration Criteria
Declare a Major Incident (P1) when any of the following is true:
- Complete outage of a Tier 1 service (as defined in the service catalogue)
- More than 50% of users unable to access a critical business service
- Security breach or suspected data exfiltration
- Regulatory reporting obligation triggered (GDPR, NCA, TDRA)
- Customer-facing service degraded for more than 30 minutes with no ETA to resolution
Declaration authority: IT Service Delivery Manager or on-call manager.
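
The criteria above amount to an "any of" rule, which can be sketched as a small check. This is a minimal illustration, not part of the procedure itself: the `IncidentSignals` type, its field names, and the `p1_reasons` function are hypothetical, and the final decision still rests with the declaration authority.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class IncidentSignals:
    """Hypothetical inputs gathered during triage (names are assumptions)."""
    tier1_complete_outage: bool = False
    critical_service_users_blocked_pct: float = 0.0
    security_breach_suspected: bool = False
    regulatory_reporting_triggered: bool = False
    customer_facing_degraded_minutes: int = 0
    eta_known: bool = True


def p1_reasons(s: IncidentSignals) -> list[str]:
    """Return every P1 criterion the signals satisfy (empty list = not a P1)."""
    reasons = []
    if s.tier1_complete_outage:
        reasons.append("Complete outage of a Tier 1 service")
    if s.critical_service_users_blocked_pct > 50:
        reasons.append("More than 50% of users unable to access a critical business service")
    if s.security_breach_suspected:
        reasons.append("Security breach or suspected data exfiltration")
    if s.regulatory_reporting_triggered:
        reasons.append("Regulatory reporting obligation triggered")
    # Degradation only qualifies when it exceeds 30 minutes AND no ETA exists.
    if s.customer_facing_degraded_minutes > 30 and not s.eta_known:
        reasons.append("Customer-facing service degraded for more than 30 minutes with no ETA")
    return reasons


signals = IncidentSignals(customer_facing_degraded_minutes=45, eta_known=False)
print(p1_reasons(signals))
```

Returning the list of matched criteria, rather than a bare boolean, gives the declaring manager the wording to record in the incident record.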
War Room Setup
### War Room Checklist
- [ ] Bridge line open: [conference number / Teams link]
- [ ] Incident record created: INC[number]
- [ ] Major Incident Manager assigned
- [ ] Resolver groups joined (L2/L3 leads)
- [ ] Communications Manager assigned
- [ ] Management bridge established (separate from technical bridge)
- [ ] Monitoring dashboard shared
- [ ] Vendor support engaged (if applicable): Ticket [number]

Communication Templates
Initial Notification (T+15 minutes)
Subject: [P1 ACTIVE] [Service Name] — Major Incident — INC[number]
We are currently investigating a major incident affecting [Service Name].
Status: INVESTIGATING
Impact: [Description of impact]
Users affected: [Estimate]
Start time: [HH:MM TZ]
Next update: [HH:MM TZ] or on status change.
Incident Manager: [Name] — [Contact]

Hourly Update
Subject: [P1 UPDATE — T+Xh] [Service Name] — INC[number]
Time since declaration: [X hours Y minutes]
Current status: [INVESTIGATING / IDENTIFIED / IMPLEMENTING FIX]
Timeline:
- [HH:MM] — Issue first detected
- [HH:MM] — Root cause identified: [brief description]
- [HH:MM] — Fix being implemented
ETA to resolution: [HH:MM] or UNKNOWN
Actions in progress:
1. [Action being taken]
2. [Action being taken]

Resolution Notification
Subject: [P1 RESOLVED] [Service Name] — INC[number]
The major incident affecting [Service Name] has been resolved.
Resolution time: [HH:MM TZ]
Total duration: [X hours Y minutes]
Root cause (preliminary): [One sentence]
Service status: FULLY RESTORED
A Post-Incident Review will be conducted within 5 business days.

Post-Incident Review Template
# Post-Incident Review — INC[number]
**Date of incident**: [Date]
**Review date**: [Date — within 5 business days]
**Facilitator**: [Problem Manager]
**Attendees**: [List]
## Incident Timeline (chronological)
| Time | Event |
|------|-------|
| [HH:MM] | [What happened] |
## Root Cause Analysis
**Immediate cause**: [What directly caused the failure]
**Contributing factors**: [What conditions allowed it to happen]
**Root cause**: [The underlying systemic issue]
*Method used*: [ ] 5 Whys [ ] Fishbone [ ] Fault Tree
## Impact Assessment
- Duration: [X hours Y minutes]
- Users affected: [Number]
- Business impact: [Description]
- SLA breach: [Yes/No] — [SLA name and miss margin]
## Action Items
| # | Action | Owner | Due Date | Status |
|---|--------|-------|----------|:------:|
| 1 | [Corrective action] | [Name] | [Date] | Open |
| 2 | [Preventive action] | [Name] | [Date] | Open |
## Problem Record
- Problem created: [ ] Yes — PRB[number] [ ] No — Reason: [reason]
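
The "[X hours Y minutes]" duration fields and the 5-business-day review deadline used in the templates can be derived from the incident timestamps. The sketch below is one way to do that. The function names are hypothetical, the deadline is assumed to count from the resolution date, and only weekends are skipped (public holidays are ignored).

```python
from datetime import date, datetime, timedelta


def format_duration(start: datetime, end: datetime) -> str:
    """Render elapsed time in the templates' 'X hours Y minutes' form."""
    total_minutes = int((end - start).total_seconds() // 60)
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours} hours {minutes} minutes"


def pir_due_date(resolved_on: date, business_days: int = 5) -> date:
    """Latest review date, counting Monday to Friday only.

    Assumption: the clock starts the day after resolution and
    public holidays are not excluded.
    """
    current = resolved_on
    remaining = business_days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            remaining -= 1
    return current


declared = datetime(2024, 3, 1, 9, 15)
resolved = datetime(2024, 3, 1, 12, 0)
print(format_duration(declared, resolved))  # 2 hours 45 minutes
print(pir_due_date(resolved.date()))        # 2024-03-08
```

Both timestamps should be in the same time zone (or both timezone-aware) before they are subtracted.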