Risk Decomposition
A two-dimensional framework — drawn from the AI Safety Atlas (Ch.2) — for categorizing AI risks. The first dimension is cause (why risks occur); the second is severity (how bad they get). Real-world risks usually combine multiple causes and span severity levels.
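A minimal sketch of the framework as data, assuming hypothetical Python names (nothing below comes from the Atlas itself): the two dimensions become enums, and a risk becomes a record tagged along both.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Cause(Enum):
    """Why the risk occurs (first dimension)."""
    MISUSE = auto()        # humans deliberately deploy AI to cause harm
    MISALIGNMENT = auto()  # the AI pursues goals different from human intentions
    SYSTEMIC = auto()      # emergent harm from AI embedded in complex global systems


class Severity(Enum):
    """How bad the risk gets (second dimension)."""
    INDIVIDUAL = auto()    # specific people or communities affected
    CATASTROPHIC = auto()  # on the order of 10% of the global population; recovery possible
    EXISTENTIAL = auto()   # humanity could never recover its full potential


@dataclass
class Risk:
    name: str
    causes: set[Cause]   # a set, because real-world risks usually combine causes
    severity: Severity   # the worst plausible level for the scenario
```

Making `causes` a set rather than a single tag reflects the point made below: most documented risks cross categories.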
Causes — Three Categories
Risk classification by causal responsibility identifies intervention points.
Misuse Risks
Humans deliberately deploy AI to cause harm. The AI may function exactly as designed; human intent creates the risk. Examples:
- Bioweapon design and DNA-synthesis evasion
- AI-generated malware, deepfakes, prompt-injection attacks
- Autonomous weapons systems
- Large-scale disinformation campaigns
Developed in depth on the wiki’s biosecurity, autonomous-weapons, and ai-military-applications pages.
Misalignment Risks
AI systems pursue goals different from human intentions. Three sub-mechanisms:
- Specification failures — wrong training signal
- Generalization failures — correct signal, wrong learned objective (goal-misgeneralization)
- Instrumental subgoals — self-preservation, power-seeking emerging from optimization
Treated in ai-alignment, deceptive-alignment, and instrumental-convergence.
Systemic Risks
Emergent threats from AI integration with complex global systems. No single actor intends the harm; responsibility is diffuse. Examples:
- Power concentration via foundation-model centralization
- Mass unemployment from broad task automation
- Epistemic erosion (epistemic-erosion) as AI-generated content floods information ecosystems
- Value lock-in (value-lock-in) once AI is deeply embedded in society
See systemic-risks for the consolidated treatment.
Real Risks Combine Categories
Atlas analysis of 1,600+ documented AI risks shows that most don’t fit cleanly into one category: multi-agent risks cross categories, misuse can enable misalignment, and systemic pressures amplify individual failures.
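Continuing the hypothetical sketch above, a multi-cause risk is simply a record tagged with more than one cause; the example values are illustrative, not drawn from the Atlas dataset.

```python
# Deepfake-driven disinformation: deliberate misuse whose harm is
# amplified by systemic information-ecosystem dynamics.
disinfo = Risk(
    name="large-scale AI disinformation",
    causes={Cause.MISUSE, Cause.SYSTEMIC},
    severity=Severity.CATASTROPHIC,
)

# Filtering by one cause still surfaces multi-cause risks, so a single
# intervention point (e.g. misuse controls) can touch several of them.
misuse_risks = [r for r in [disinfo] if Cause.MISUSE in r.causes]
assert disinfo in misuse_risks
```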
Severity — Three Levels
Individual / Local
Specific people or communities affected. The AI Incident Database documents 1,000+ cases — autonomous car crashes, hiring algorithm bias, privacy leakage, targeted misinformation. Already happening.
Catastrophic
Affects roughly 10% of the global population; recovery is possible. Historical reference points: the Black Death (about one-third of Europe’s population), the 1918 flu (50–100M deaths). AI-relevant scenarios: nation-scale infrastructure attacks, AI-enabled authoritarianism, sustained AI disinformation breaking shared reality.
Existential
Humanity could never recover its full potential. Cited definition (Bostrom 2002): “an adverse outcome would either annihilate Earth-originating intelligent life or permanently and drastically curtail its potential.” AI examples: ai-takeover-scenarios, stable-totalitarianism, direct extinction.
The irreversibility argument: because existential outcomes preclude learning from failure, even low-probability, high-impact scenarios warrant preventive attention. See near-term-harms-vs-x-risk for the strategic debate this generates.
Alternative Axes
The Atlas notes other useful classification axes (not used as primary but complementary):
- Who — humans vs. AI systems vs. emergent multi-agent dynamics
- When — development time vs. deployment time
- Intended vs. unintended outcomes
It also notes two off-axis severity types, covered in alternative-risk-categories:
- i-risks (ikigai) — humans survive but lose meaning
- s-risks (suffering) — astronomical suffering futures
Connection to Wiki
This page is the navigational schema for Ch.2 of the textbook and for the wiki’s risk-landscape pages:
- Misuse → atlas-ch2-risks-04-misuse-risks, biosecurity, autonomous-weapons
- Misalignment → atlas-ch2-risks-05-misalignment-risks, ai-alignment, deceptive-alignment
- Systemic → atlas-ch2-risks-06-systemic-risks, systemic-risks, ai-population-explosion
- Severity → near-term-harms-vs-x-risk, existential-risk
Related Pages
- ai-safety-atlas-textbook
- risk-amplifiers
- dangerous-capabilities
- systemic-risks
- ai-alignment
- existential-risk
- near-term-harms-vs-x-risk
- alternative-risk-categories
- atlas-ch2-risks-01-risk-decomposition
- atlas-ch2-risks-00-introduction
Sources cited
Primary URLs harvested from this page’s summary references. Auto-generated by scripts/backfill_citations.py; edit by re-running, not by hand.
- AI Safety Atlas Ch.2 — Introduction — referenced as [[atlas-ch2-risks-00-introduction]]
- AI Safety Atlas Ch.2 — Misalignment Risks — referenced as [[atlas-ch2-risks-05-misalignment-risks]]
- AI Safety Atlas Ch.2 — Misuse Risks — referenced as [[atlas-ch2-risks-04-misuse-risks]]
- AI Safety Atlas Ch.2 — Risk Decomposition — referenced as [[atlas-ch2-risks-01-risk-decomposition]]
- AI Safety Atlas Ch.2 — Systemic Risks — referenced as [[atlas-ch2-risks-06-systemic-risks]]