Summary: Optimal Timing for Superintelligence
Author: nick-bostrom
Year: 2026 (Working paper, version 1.0)
Source: bostrom-optimal-timing.pdf
Overview
This paper argues that developing superintelligence is not analogous to playing Russian roulette but is better compared to undergoing risky surgery for a condition that will otherwise prove fatal. Through a series of increasingly sophisticated mathematical models, Bostrom examines when it would be optimal to deploy superintelligent AI from the perspective of currently living people. The central conclusion is captured in the phrase “swift to harbor, slow to berth”: move quickly toward AGI capability, then pause briefly before full deployment to harvest front-loaded safety gains.
The Core Argument
Bostrom frames the choice not as “safe baseline vs. risky AI venture” but as a comparison between two risky trajectories:
- Without superintelligence: 170,000 people die every day from disease, aging, and other causes; humanity remains exposed to ongoing existential-risk from other technologies; widespread suffering continues among humans and animals.
- With superintelligence: Unprecedented risks from AI misalignment and other failure modes, but also the possibility of eliminating aging, curing all diseases, and unlocking extraordinary levels of flourishing.
This framing directly challenges calls for permanent AI moratoriums, such as the one advanced by eliezer-yudkowsky and nate-soares in their book If Anyone Builds It, Everyone Dies (2025). Bostrom counters: if nobody builds AGI, everyone also dies, just from the baseline causes that superintelligence could address.
The Models
The paper restricts itself to “mundane considerations” (setting aside simulation theory, digital minds, infinite ethics, etc.) and adopts a person-affecting perspective focused on currently living people (rather than an impersonal perspective weighing possible future generations).
1. Simple Go/No-Go Model
Assuming an average remaining life expectancy of 40 years without superintelligence and roughly 1,400 years with it (based on reducing mortality to the rate of healthy 20-year-olds), developing superintelligence increases expected remaining lifespan provided the probability of AI-induced annihilation is below about 97%. Even under very pessimistic "doomer" assumptions, the expected-value calculation favors development.
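The arithmetic behind the threshold, using the figures above and valuing annihilation at zero remaining years:

$$(1 - p)\times 1400 > 40 \quad\Longleftrightarrow\quad p < 1 - \tfrac{40}{1400} \approx 0.97,$$

so launching raises expected remaining lifespan for any annihilation probability $p$ below roughly 97%.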
2. Timing and Safety Progress
Moving beyond the binary choice, this model asks: given that we can reduce AI risk by waiting (through ai-alignment research), how long should we wait? The answer depends on the initial risk level and the rate of safety progress.
Key finding: both very fast and very slow rates of safety progress favor earlier launch. Fast progress means the risk drops quickly, so there is no need to wait long. Slow progress means waiting yields little benefit. It is intermediate-to-slow progress rates that produce the longest optimal delays. For most parameter settings, optimal delays are modest, typically a single-digit number of years.
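A minimal sketch of this trade-off in code, under assumptions of my own rather than the paper's exact model (exponentially decaying launch risk, constant baseline mortality hazard, fixed post-AGI lifespan):

```python
# A minimal sketch, not the paper's model: launch risk decays as
# r(t) = r0 * exp(-k t) thanks to alignment research, while people
# die at baseline hazard mu during the wait; survivors of a
# successful launch gain L further years.
import numpy as np

def value_of_launching_at(t, r0=0.2, k=0.5, mu=1/40, L=1400):
    """Expected remaining life-years if launch happens after t years."""
    pre_agi_years = (1 - np.exp(-mu * t)) / mu     # years lived while waiting
    p_alive_at_launch = np.exp(-mu * t)            # survive the wait
    p_launch_succeeds = 1 - r0 * np.exp(-k * t)    # risk falls as safety work accrues
    return pre_agi_years + p_alive_at_launch * p_launch_succeeds * L

ts = np.linspace(0, 50, 2001)
for k in (5.0, 0.5, 0.05):  # fast, intermediate, slow safety progress
    t_star = ts[np.argmax([value_of_launching_at(t, k=k) for t in ts])]
    print(f"k = {k}: optimal delay ≈ {t_star:.1f} years")
```

On these illustrative parameters the pattern above emerges directly: fast progress (k = 5) yields an optimal delay under a year, intermediate progress (k = 0.5) about three years, and slow progress (k = 0.05) none at all.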
3. Temporal Discounting
Adding a pure time preference (where future benefits are valued less) generally pushes toward later launch: the vast long-term benefits of centuries of life extension are heavily downweighted, while the risk of dying in an AI catastrophe at launch is immediate and retains full weight, making the gamble look worse. However, this effect reverses when post-AGI life quality is sufficiently higher than pre-AGI life quality: impatience then penalizes delaying the onset of that better existence.
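A rough way to see both effects in symbols (my reconstruction, not the paper's exact equations): with pure time-preference rate $\rho$, baseline mortality hazard $\mu$, launch risk $r(t)$, pre- and post-AGI quality levels $q_0$ and $q_1$, and post-AGI lifespan $L$, the discounted value of launching at time $t$ is approximately

$$V(t) = q_0 \int_0^t e^{-(\rho+\mu)s}\, ds \;+\; e^{-(\rho+\mu)t}\,\bigl(1 - r(t)\bigr)\, q_1 \int_0^{L} e^{-\rho s}\, ds .$$

Raising $\rho$ caps the post-AGI integral near $q_1/\rho$ however large $L$ is, shrinking the prize that justifies accepting launch risk and thus favoring delay; but the same $\rho$ sits in the $e^{-(\rho+\mu)t}$ factor on the entire post-AGI term, so once $q_1 \gg q_0$ the dominant cost of waiting is postponing the better era, and the effect flips.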
4. Quality-of-Life Adjustment
If post-AGI life is not merely longer but also better, the “launch immediately” region expands. However, this effect saturates: even infinitely large quality improvements cannot push all cases to immediate launch, because at some point pre-AGI life contributes almost nothing to expected value and the main concern becomes maximizing the chance of reaching the post-AGI era.
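The saturation point follows from the same sketch: as $q_1 \to \infty$, the pre-AGI term becomes negligible and

$$\arg\max_t V(t) \;\longrightarrow\; \arg\max_t \; e^{-(\rho+\mu)t}\bigl(1 - r(t)\bigr),$$

i.e. the problem reduces to maximizing the discounted probability of reaching the post-AGI era alive. That limiting objective still trades falling risk against mortality during the wait, so a positive optimal delay can survive arbitrarily large quality gains.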
5. Diminishing Marginal Utility (Risk Aversion)
With standard constant relative risk aversion (CRRA) utility functions calibrated to the health-economics literature, risk aversion makes the decision-maker more conservative, shrinking the "launch immediately" zone and increasing optimal wait times. Even substantial risk aversion, however, does not radically alter the overall picture.
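A hedged illustration of the direction of the effect, applying CRRA utility over remaining life-years to the go/no-go numbers (the normalization u(0) = 0 is mine and requires η < 1; the paper's calibration is not reproduced here):

```python
# Hedged illustration (my normalization, not necessarily the paper's):
# CRRA utility over remaining life-years, u(c) = c**(1 - eta) / (1 - eta),
# normalized so that u(0) = 0, which requires eta < 1. The break-even
# annihilation risk p* then solves (1 - p) * u(post_agi) = u(baseline).
def break_even_risk(eta, baseline=40.0, post_agi=1400.0):
    assert 0 <= eta < 1, "the u(0) = 0 normalization needs eta < 1"
    return 1 - (baseline / post_agi) ** (1 - eta)

for eta in (0.0, 0.5, 0.8):  # eta = 0 recovers the risk-neutral model
    print(f"eta = {eta}: break-even risk ≈ {break_even_risk(eta):.0%}")
# prints roughly 97%, 83%, and 51%
```

Even at η = 0.8 the break-even risk stays above 50%, consistent with the claim that risk aversion tempers but does not overturn the analysis.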
6. Multiphase Model: “Swift to Harbor, Slow to Berth”
The most developed model distinguishes two phases:
- Phase 1: Time until AGI capability exists (largely driven by technical difficulty and competitive dynamics).
- Phase 2: Deliberate pause before full deployment.
Phase 2 is further divided into subphases with front-loaded safety gains:
- Phase 2a (weeks to months): Very rapid progress. Researchers can finally study the actual system, probe its failure modes, and implement oversight.
- Phase 2b (roughly 1 year): Fast but decelerating progress; low-hanging fruit picked.
- Phase 2c (several years): Progress returns to Phase 1 rates.
- Phase 2d (indefinite): Very slow fundamental research.
This front-loading means that time early in Phase 2 buys more safety per unit of time than time in Phase 1 or late in Phase 2. The optimal strategy therefore often involves accelerating through Phase 1 to reach Phase 2 sooner, then pausing for months to a small number of years to capture the safety windfall before deploying.
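A toy schedule that makes the front-loading concrete (subphase durations follow the list above; the per-year risk-reduction rates are illustrative assumptions of mine):

```python
# Illustrative front-loaded schedule of risk reduction in Phase 2
# (durations follow the subphases above; the rates are toy numbers).
def risk_reduction_rate(t):
    """Safety gained per year, t years after AGI capability is reached."""
    if t < 0.25:   # Phase 2a: first ~3 months, study the actual system
        return 0.40
    if t < 1.25:   # Phase 2b: ~1 year of fast but decelerating progress
        return 0.10
    if t < 5.0:    # Phase 2c: back to pre-AGI (Phase 1) research rates
        return 0.02
    return 0.005   # Phase 2d: slow fundamental research

def total_risk_reduction(pause_years, dt=0.001):
    """Riemann-sum integral of the rate over the length of the pause."""
    steps = int(pause_years / dt)
    return sum(risk_reduction_rate(i * dt) for i in range(steps)) * dt

for pause in (0.25, 1.25, 5.0, 10.0):
    print(f"pause of {pause:>5} years buys ≈ {total_risk_reduction(pause):.2f} risk reduction")
```

On these numbers, the first three months of Phase 2 buy roughly a third of all the risk reduction available in a full decade, which is the sense in which a short, well-timed pause captures most of the safety windfall.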
7. Safety Testing as a POMDP
When system risk is uncertain, the relevant object is not a single optimal launch date but an optimal policy that conditions on evidence from safety tests, formalized as a partially observable Markov decision process (POMDP). The policy launches quickly when tests pass repeatedly (indicating a likely-safe system) and delays when tests fail (indicating danger). Because a test-conditioned policy can always mimic a fixed-delay strategy while also exploiting the information tests provide, safety testing never lowers, and typically raises, expected utility relative to any fixed delay.
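A minimal sketch of the belief dynamics such a policy conditions on, with illustrative test reliabilities of my own (the paper's POMDP specification is not reproduced here):

```python
# Toy belief update: the system is either safe or dangerous, and each
# safety test passes with higher probability if the system is safe.
def update_belief(p_safe, passed, p_pass_safe=0.9, p_pass_dangerous=0.4):
    """Bayesian update of P(system is safe) after one test result."""
    like_safe = p_pass_safe if passed else 1 - p_pass_safe
    like_dang = p_pass_dangerous if passed else 1 - p_pass_dangerous
    return p_safe * like_safe / (p_safe * like_safe + (1 - p_safe) * like_dang)

belief = 0.5
for outcome in (True, True, True):  # three passes in a row
    belief = update_belief(belief, outcome)
print(f"P(safe) after three passes ≈ {belief:.2f}")  # ≈ 0.92
```

A threshold policy then launches once the belief clears a bar implied by the value calculations above, and keeps testing (or pauses) otherwise.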
Distributional Considerations
Bostrom examines who benefits from different timelines:
- The elderly, sick, and poor have the most to gain and least to lose from early deployment. Their expected lifespan under the status quo is short, and post-AGI quality of life would represent a larger relative improvement.
- The young and healthy can tolerate longer delays without risking death before AGI arrives.
Under a prioritarian social welfare function (where improving the welfare of the worst-off receives extra weight), the optimal timeline becomes shorter than under neutral utilitarianism. Bostrom also considers and rejects the “full cup” view (that life-years beyond roughly 70 contribute little value), arguing it conflates the deprivations of old age under current conditions with a fundamental limit on the value of life.
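One standard way to formalize the prioritarian result (my notation, not necessarily the paper's): the planner maximizes

$$W = \sum_i g(u_i), \qquad g \text{ strictly concave, e.g. } g(u) = \frac{u^{1-\gamma}}{1-\gamma},\ \gamma > 0,$$

where $u_i$ is person $i$'s expected lifetime welfare. Since the elderly, sick, and poor have low $u_i$ under the status quo, the marginal weight $g'(u_i)$ on their gains is largest, and their interest in early deployment pulls the social optimum earlier than under the utilitarian case $g(u) = u$.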
Theory of Second Best: Risks of Pausing
Even if an ideal pause would be beneficial, Bostrom catalogs numerous ways that calls for AI pauses or moratoria could backfire in practice:
- A call to pause comes too early and is dismissed, reducing willingness to pause later when it matters
- Poorly designed regulation creates safety theater without reducing real risks
- Development shifts to less scrupulous actors or less cooperative states
- National security exemptions push AI into the military sector with less ai-alignment focus
- A pause creates hardware/algorithm overhangs leading to more explosive and dangerous progress when lifted
- The pause calcifies into permanent relinquishment, foreclosing the immense benefits of superintelligence
- Enforcement apparatus risks enabling stable-totalitarianism
- Agitation leads to polarization and extremism, destroying capacity for constructive dialogue
Bostrom distinguishes three types of pause: (1) a frontrunner unilaterally burning its lead (most attractive, self-limiting), (2) government-imposed moratorium (more failure modes), and (3) internationally agreed prohibition (highest risk of permanent relinquishment and surveillance state).
Conclusions
The central contributions are:
- The baseline is not safe. Individual mortality and other risks mean that delay has real costs, not just opportunity costs.
- Even very high catastrophe probabilities can be worth accepting when the alternative is certain death from aging within decades.
- Optimal delays are typically modest: months to a small number of years after AGI capability is reached.
- Safety progress is front-loaded in Phase 2, creating special value in a brief pause once AGI-capable systems exist.
- Prioritarian ethics shorten optimal timelines, since those worst off under the status quo benefit most from early deployment.
- Implementation matters enormously: poorly executed pauses can be worse than no pause at all.
The paper explicitly limits itself to mundane, person-affecting considerations, noting that arcane considerations (simulation theory, digital minds, etc.) and impersonal perspectives (weighing future generations) are left for future work.
Related Pages
- nick-bostrom
- existential-risk
- ai-safety
- ai-alignment
- ai-governance
- transformative-ai
- intelligence-explosion
- instrumental-convergence
- ai-takeover-scenarios
- stable-totalitarianism
- differential-development
- responsible-scaling-policy
- capability-evaluations
- scalable-oversight
- eliezer-yudkowsky
- nate-soares
- toby-ord
- future-of-humanity-institute
- academic-papers-index
- summary-bostrom-ai-expert-survey
- situational-awareness
- 80k-ai-risk