Problem Management Lifecycle
ITIL 4 Problem Management Workflow
1. 🔍 Problem Identification
2. 📋 Problem Logging & Categorisation
3. 🔬 Investigation & Diagnosis
4. 🛡️ Workaround Definition (decision point)
5. ✅ Root Cause Confirmed
6. 🔧 Permanent Fix Implementation
7. 🔒 Problem Closure
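As a rough illustration, the seven steps can be modelled as an ordered enumeration. This is a minimal sketch that assumes a strictly linear flow; the branch taken at the Workaround Definition decision point is not described here, so it is not modelled. The names `Step` and `next_step` are hypothetical.

```python
from enum import IntEnum
from typing import Optional


class Step(IntEnum):
    """The seven workflow steps, numbered as in the list above."""
    PROBLEM_IDENTIFICATION = 1
    PROBLEM_LOGGING_AND_CATEGORISATION = 2
    INVESTIGATION_AND_DIAGNOSIS = 3
    WORKAROUND_DEFINITION = 4  # decision point; branching is not modelled
    ROOT_CAUSE_CONFIRMED = 5
    PERMANENT_FIX_IMPLEMENTATION = 6
    PROBLEM_CLOSURE = 7


def next_step(current: Step) -> Optional[Step]:
    """Return the step after `current`, or None once the problem is closed."""
    if current is Step.PROBLEM_CLOSURE:
        return None
    return Step(current + 1)
```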
5 Whys Template
The 5 Whys technique traces a problem back to its root cause by repeatedly asking "Why?".
Example: Email service intermittently unavailable
| Why # | Question | Answer |
|---|---|---|
| Why 1 | Why was email unavailable? | Mail server crashed |
| Why 2 | Why did the mail server crash? | Memory exhausted |
| Why 3 | Why was memory exhausted? | Memory leak in application update |
| Why 4 | Why was the memory leak not detected? | No memory monitoring alert configured |
| Why 5 | Why was no alert configured? | Alert configuration not part of deployment checklist |
Root cause: The deployment checklist is incomplete; it does not cover monitoring configuration.
Permanent fix: Add "monitoring alert verification" as a mandatory step in the deployment checklist and the standard change template.
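The email example can also be captured as data, which makes it easy to store a 5 Whys chain alongside a problem record. This is only a sketch: the `Why` class and the `root_cause` helper are hypothetical names, and it assumes the answer to the final "Why?" is what gets treated as the root cause.

```python
from dataclasses import dataclass


@dataclass
class Why:
    question: str
    answer: str


def root_cause(chain: list[Why]) -> str:
    """Assumption: the answer to the last "Why?" is the root cause."""
    if not chain:
        raise ValueError("chain must contain at least one Why")
    return chain[-1].answer


# The chain from the email example above.
chain = [
    Why("Why was email unavailable?", "Mail server crashed"),
    Why("Why did the mail server crash?", "Memory exhausted"),
    Why("Why was memory exhausted?", "Memory leak in application update"),
    Why("Why was the memory leak not detected?",
        "No memory monitoring alert configured"),
    Why("Why was no alert configured?",
        "Alert configuration not part of deployment checklist"),
]

print(root_cause(chain))
```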
Known Error Database (KEDB)
KEDB Entry Structure
| Field | Content |
|---|---|
| Problem ID | PR00001 |
| Title | Email service memory exhaustion |
| Affected Service | Email Platform |
| Symptoms | Intermittent bounce, slow send, server unresponsive |
| Root Cause | Memory leak in mail service v3.2.1 |
| Workaround | Restart mail-service process; monitor hourly |
| Workaround Owner | Unix Operations Team |
| Permanent Fix | Upgrade to v3.2.4 (planned in CR00512) |
| Status | Known Error — Workaround Available |
| Created | 2026-04-10 |
| Last Updated | 2026-04-28 |
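One possible way to represent this entry in code is a dataclass with a small completeness check. The field names mirror the table; the `KnownError` class, the `validate` method, and the choice of which fields are mandatory are assumptions made for illustration, not part of any particular ITSM tool.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class KnownError:
    problem_id: str
    title: str
    affected_service: str
    symptoms: list[str]
    root_cause: str
    workaround: str
    workaround_owner: str
    permanent_fix: str
    status: str
    created: date
    last_updated: date

    def validate(self) -> list[str]:
        """Return a list of issues; an empty list means no issue was found."""
        issues = []
        # Assumption: these text fields must never be blank.
        for name in ("problem_id", "title", "root_cause",
                     "workaround", "workaround_owner"):
            if not getattr(self, name).strip():
                issues.append(f"{name} is empty")
        if self.last_updated < self.created:
            issues.append("last_updated precedes created")
        return issues


entry = KnownError(
    problem_id="PR00001",
    title="Email service memory exhaustion",
    affected_service="Email Platform",
    symptoms=["Intermittent bounce", "slow send", "server unresponsive"],
    root_cause="Memory leak in mail service v3.2.1",
    workaround="Restart mail-service process; monitor hourly",
    workaround_owner="Unix Operations Team",
    permanent_fix="Upgrade to v3.2.4 (planned in CR00512)",
    status="Known Error (Workaround Available)",
    created=date(2026, 4, 10),
    last_updated=date(2026, 4, 28),
)
```

Calling `entry.validate()` on the sample entry returns an empty list, since every mandatory field is filled in and the dates are in order.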
KEDB Review Cycle
- Monthly: Verify workarounds still valid; update any that have changed
- On fix deployment: Validate workaround no longer needed; archive entry
- Quarterly: Purge entries older than 12 months with no related incidents
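The quarterly purge rule can be sketched as a simple filter. Several details are assumptions here: age is measured from the creation date, "12 months" is approximated as 365 days, and `related_incidents` is a hypothetical count of incidents linked to the entry. The sample records other than PR00001 are invented for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class KedbEntry:
    problem_id: str
    created: date
    related_incidents: int  # hypothetical: incidents linked to this entry


def purge_candidates(entries, today, max_age_days=365):
    """Entries older than the cutoff that have no related incidents."""
    cutoff = today - timedelta(days=max_age_days)
    return [e for e in entries
            if e.created < cutoff and e.related_incidents == 0]


entries = [
    KedbEntry("PR00001", date(2026, 4, 10), 0),  # recent, so kept
    KedbEntry("PR00002", date(2025, 1, 15), 0),  # old and unused, so purged
    KedbEntry("PR00003", date(2025, 1, 15), 3),  # old but still referenced
]

to_purge = purge_candidates(entries, today=date(2026, 4, 28))
print([e.problem_id for e in to_purge])
```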
Proactive Problem Management
Proactive problem management identifies risks before they cause incidents:
| Technique | Frequency | Output |
|---|---|---|
| Incident trend analysis | Monthly | List of recurring categories for investigation |
| CMDB change velocity analysis | Weekly | High-risk CIs with frequent changes |
| Capacity threshold review | Monthly | CIs approaching resource limits |
| Vendor EOL calendar | Quarterly | Upcoming unsupported software/hardware |
| Security vulnerability scan correlation | Weekly | Unpatched CIs with known exploits |
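As an example of the first technique, incident trend analysis can start as a plain frequency count over incident categories. This is a minimal sketch; the `recurring_categories` function, the `min_count` threshold, and the sample categories are all hypothetical.

```python
from collections import Counter


def recurring_categories(incident_categories, min_count=3):
    """Return (category, count) pairs seen at least `min_count` times,
    most frequent first."""
    counts = Counter(incident_categories)
    return [(cat, n) for cat, n in counts.most_common() if n >= min_count]


# Hypothetical month of incident categories.
month = ["email", "email", "email", "vpn", "email", "vpn", "printer"]
print(recurring_categories(month, min_count=2))
```

A real analysis would usually also group by affected service or CI, but the counting step stays the same.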
ITIL 5: AI-Driven Predictive Problem Management
ITIL 5 elevates Problem Management with AI-driven prediction:
Continuous monitoring stream
→ AI anomaly detection (baseline deviation > 2σ)
→ Correlation with recent changes in CMDB
→ If pattern matches known failure signatures:
→ Auto-create Problem Record
→ Assign to relevant practice team
→ Suggest top 3 RCA hypotheses
→ Human validates and confirms root cause
→ AI generates KEDB draft for review
Benefits:
- Detect problems before incidents occur (proactive at scale)
- Reduce MTTR by surfacing RCA hypotheses instantly
- Continuously improve prediction accuracy with feedback loops
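The "baseline deviation > 2σ" check in the flow above can be illustrated with a basic standard-deviation test. This is a sketch under stated assumptions, not how any specific monitoring product works: the baseline is a short list of recent readings, the sample standard deviation is used, and `is_anomalous` is a hypothetical name.

```python
from statistics import mean, stdev


def is_anomalous(baseline, value, sigmas=2.0):
    """True if `value` deviates from the baseline mean by more than
    `sigmas` sample standard deviations."""
    mu = mean(baseline)
    sd = stdev(baseline)  # needs at least two baseline readings
    if sd == 0:
        # Flat baseline: any change at all counts as a deviation.
        return value != mu
    return abs(value - mu) > sigmas * sd


# Hypothetical memory-usage readings (percent).
baseline = [50, 52, 48, 51, 49]
print(is_anomalous(baseline, 58))  # far outside the usual range
print(is_anomalous(baseline, 51))  # within the usual range
```

Only the detection step is shown; correlation with CMDB changes and failure-signature matching depend on data sources that are not described here.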
KPIs
| Metric | Target |
|---|---|
| Problems resolved within target date | > 90% |
| Recurring incidents (same root cause) | < 5% month-over-month |
| Mean time to identify workaround | < 48 hours (P1 problems) |
| KEDB accuracy rate | > 95% |
| Problems identified proactively | > 20% of total |
| Knowledge articles created per problem | ≥ 1 |
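Two of these targets can be checked with simple percentage arithmetic. The sketch below assumes the counts are already available from the problem register; the function names, dictionary keys, and sample figures are hypothetical.

```python
def percentage(part, whole):
    """Percentage of `part` in `whole`; 0.0 when there is nothing to count."""
    return 100.0 * part / whole if whole else 0.0


def kpi_report(resolved_on_time, resolved_total, proactive, problems_total):
    """Compare two KPIs against the targets in the table above."""
    on_time = percentage(resolved_on_time, resolved_total)
    proactive_share = percentage(proactive, problems_total)
    return {
        "resolved_within_target_pct": on_time,
        "resolved_within_target_met": on_time > 90,   # target: > 90%
        "proactive_pct": proactive_share,
        "proactive_met": proactive_share > 20,        # target: > 20% of total
    }


# Hypothetical figures: 46 of 50 resolved on time, 12 of 50 found proactively.
report = kpi_report(46, 50, 12, 50)
print(report)
```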
Downloadable Resources
| Resource | Format |
|---|---|
| Problem Register | Excel |
| Knowledge Management Process | Word |