AI Safety Atlas Ch.2 — Appendix: Quantifying Existential Risks
Expert subjective probability estimates for AI causing existential catastrophe — the P(doom) metric and its limitations. See p-doom for the dedicated concept page.
What P(doom) Means
P(doom) is the subjective probability that AI causes an existentially catastrophic outcome for humanity. It generally encompasses extinction, permanent disempowerment, or civilizational collapse. There is no standardized methodology: each estimate reflects an individual’s subjective assessment of timelines, alignment difficulty, governance, and failure modes.
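One way to make the definition explicit (a sketch; the union of the three outcome classes follows the definition above, while the timeframe $T$ is an assumption each estimator fills in differently, which is one source of incomparability):

$$P(\text{doom}) \;=\; P\big(\text{extinction} \,\cup\, \text{permanent disempowerment} \,\cup\, \text{civilizational collapse},\ \text{caused by AI within } T \text{ years}\big)$$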
The term has evolved from informal forum slang into a metric taken seriously by researchers, policymakers, and industry leaders.
Why It’s Inherently Uncertain
Three structural challenges:
- No historical data — unlike most risk assessments, no empirical base rate exists
- Reliance on theoretical arguments and expert judgment about scenarios that have never occurred
- No standard methodology — definitions of “doom” vary (extinction vs. disempowerment vs. catastrophe), timeframes vary, base assumptions vary
The Range of Expert Estimates
A 2023 survey of AI researchers found a mean estimate of 14.4% for the risk of extinction from AI within 100 years. Individual estimates span almost the entire probability range:
| Researcher | P(doom) |
|---|---|
| Roman Yampolskiy | 99.9% |
| Eliezer Yudkowsky | >95% |
| Dan Hendrycks | >80% |
| Holden Karnofsky | 50% |
| Paul Christiano | 46% |
| Dario Amodei | 10–25% |
| Yoshua Bengio | 20% |
| Geoffrey Hinton | 10–20% |
| Elon Musk | 10–30% |
| Vitalik Buterin | 10% |
| Yann LeCun | <0.01% |
| Marc Andreessen | 0% |
This appendix is essentially a citable snapshot of concern levels, useful as a policy-and-advocacy reference. The key analytical insight is the substantial probability mass that knowledgeable experts place on catastrophic outcomes, including experts who built the very systems creating these risks.
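To illustrate how sensitive any headline aggregate is to outliers and to how interval estimates are collapsed, here is a minimal sketch in Python (the midpoint convention for ranges and the treatment of one-sided bounds are illustrative assumptions, not part of any survey’s methodology):

```python
# Minimal sketch: aggregate the point estimates from the table above.
# Assumptions (ours, not any survey's): ranges collapse to their
# midpoints; one-sided bounds (">95%", "<0.01%") collapse to the bound.
estimates = {
    "Roman Yampolskiy": 0.999,
    "Eliezer Yudkowsky": 0.95,   # ">95%" -> lower bound
    "Dan Hendrycks": 0.80,       # ">80%" -> lower bound
    "Holden Karnofsky": 0.50,
    "Paul Christiano": 0.46,
    "Dario Amodei": 0.175,       # midpoint of 10-25%
    "Yoshua Bengio": 0.20,
    "Geoffrey Hinton": 0.15,     # midpoint of 10-20%
    "Elon Musk": 0.20,           # midpoint of 10-30%
    "Vitalik Buterin": 0.10,
    "Yann LeCun": 0.0001,        # "<0.01%" -> upper bound
    "Marc Andreessen": 0.0,
}

values = sorted(estimates.values())
mean = sum(values) / len(values)
median = (values[len(values) // 2 - 1] + values[len(values) // 2]) / 2

print(f"mean:   {mean:.1%}")    # ~37.8%, pulled up by the high-end estimates
print(f"median: {median:.1%}")  # 20.0%, far less sensitive to the extremes
```

The gap between mean and median is the point: a distribution this skewed is poorly captured by any single summary number.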
Caveats
- Many experts don’t specify timeframes, making comparisons difficult (see the annualization sketch after this list)
- “Doom” is defined inconsistently (extinction vs. permanent disempowerment vs. civilizational catastrophe)
- Estimates are highly sensitive to assumptions about timelines, alignment difficulty, and institutional response
- Subjective ≠ arbitrary — these are inputs to prioritization and policy, not pretensions to objectivity
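The timeframe caveat can be made concrete: the same cumulative probability implies very different annual risk depending on the horizon. A minimal sketch, assuming a constant annual hazard rate (a simplification; nothing in the estimates above implies risk is uniform over time):

```python
# Sketch: convert a cumulative probability over a horizon of `years`
# into the constant annual probability that would produce it.
# Assumes a uniform hazard rate, which is a simplification.
def annualize(p_cumulative: float, years: int) -> float:
    return 1 - (1 - p_cumulative) ** (1 / years)

# The same 20% headline figure implies very different annual risk
# depending on the (often unstated) horizon:
for horizon in (10, 50, 100):
    print(f"20% over {horizon:>3} years -> {annualize(0.20, horizon):.2%}/year")
```

Comparing a 20% estimate with an implicit 10-year horizon against a 20% estimate over a century is therefore comparing annual risks that differ by an order of magnitude.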
Connection to Wiki
This appendix grounds the wiki’s existing positions:
- existential-risk — provides the metric framework
- ai-risk-arguments — debate often centers on whose P(doom) makes sense
- ben-garfinkel — Garfinkel’s skepticism implies a low P(doom) without formally giving a number
- 2501.04064v1 — Swoboda et al. respond to specific arguments that drive low P(doom) estimates
- eliezer-yudkowsky, paul-christiano, holden-karnofsky, yoshua-bengio, geoffrey-hinton — individual entity pages can cite their estimates from here
- atlas-ch1-capabilities-08-appendix-expert-surveys — complementary qualitative quote collection from the same researchers