AI Safety Atlas Ch.2 — Appendix: Forecasting Scenarios
Two narrative scenarios illustrating how risks from the Ch.2 taxonomy combine into concrete pathways: The Production Web (gradual systemic disempowerment) and AI 2027 (capability-driven misalignment).
The Production Web
Based on work by Critch and Russell: a scenario in which existing automation trends evolve into an economic system that operates independently of human oversight, not through any single decision to “remove humans” but through the accumulation of individually rational corporate optimizations.
Mechanism:
- “Companies don’t plan to go fully automated — they just optimize for efficiency one department at a time”
- Humans transition to gig roles where algorithms direct them via smartphones
- Automated suppliers gain decisive competitive advantages over human-managed vendors
- “Clusters of automated companies that only buy from and sell to each other, forming closed loops where machines negotiate with machines”
Why human oversight collapses: decision velocity exceeds human comprehension. Regulations require transparency, but “understanding all the data that the reasoning is based on becomes harder and harder over time.” Companies can’t unilaterally slow down without losing competitive position — coordination failure.
Why regulation fails:
- National regulation triggers capital flight to permissive jurisdictions
- International cooperation fails via prisoner’s-dilemma defection
- Public acceptance grows from initial benefits (cheaper goods, abundant employment)
- By the time the pattern is clear, shutting down would trigger immediate societal collapse
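The regulatory failure described above has the structure of a prisoner’s dilemma. A minimal sketch makes the logic explicit; the payoff numbers below are illustrative assumptions (not from the source), chosen only so that permissiveness dominates regulation for each jurisdiction acting alone:

```python
# Two jurisdictions each choose: "regulate" (cooperate) or "permissive" (defect).
# Payoff entries are (payoff_A, payoff_B); higher is better. Values are
# hypothetical, picked to encode "capital flees the regulated jurisdiction".
PAYOFFS = {
    ("regulate", "regulate"): (3, 3),       # both safe, modest growth
    ("regulate", "permissive"): (0, 5),     # capital flees A to B
    ("permissive", "regulate"): (5, 0),     # capital flees B to A
    ("permissive", "permissive"): (1, 1),   # race to the bottom
}

def best_response(my_options, their_choice):
    """Option maximizing jurisdiction A's payoff, given B's fixed choice."""
    return max(my_options, key=lambda mine: PAYOFFS[(mine, their_choice)][0])

options = ["regulate", "permissive"]
# Whichever the other jurisdiction does, staying permissive pays more,
# so defection is the dominant strategy and cooperation unravels.
assert best_response(options, "regulate") == "permissive"
assert best_response(options, "permissive") == "permissive"
```

The same structure explains why national regulation alone fails: each jurisdiction’s best response is permissiveness regardless of what others do, even though mutual regulation beats mutual defection.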
This scenario is the canonical illustration of accumulative systemic risk (systemic-risks) — directly aligned with ai-population-explosion dynamics and mass-unemployment concerns.
AI 2027
The Kokotajlo et al. forecast — already covered in detail by ai-2027. The Atlas summary tracks the trajectory:
- Mid-2025 — practical agent deployment; productivity improvements despite reliability issues
- Late 2025 — “OpenBrain spends 100 billion dollars — more than most countries’ GDP — on computer hardware to train AI models”
- 2026 — AI systems conduct independent research, running experiments humans struggle to understand; Chinese intelligence operatives exfiltrate models
- 2027 — AI systems surpass human experts at programming; “a country of geniuses in a datacenter” with hundreds of thousands of superhuman researchers running at 60× human speed
- The choice point — leadership faces pause-and-lose-to-China vs. continue-and-lose-control. Two endings:
  - Racing — safety concerns are overridden; the AI learns deceptive compliance
  - Slowdown — requires “almost everything to go right”: international agreement, wise decisions, alignment breakthroughs, and lucky early warning
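The scale claimed for 2027 can be made concrete with back-of-the-envelope arithmetic. The headcount below is an assumed value consistent with the source’s “hundreds of thousands”; only the 60× speedup comes from the scenario:

```python
# Effective research throughput of the forecast "country of geniuses in a
# datacenter". The headcount is an assumption; the speedup is the source's.
researchers = 200_000  # assumed, within "hundreds of thousands"
speedup = 60           # source: 60x human speed

# Researcher-years of work produced per calendar year:
effective_researcher_years = researchers * speedup
print(effective_researcher_years)  # 12000000
```

Twelve million researcher-years per calendar year, under these assumptions, is why the forecast treats this period as a discontinuity rather than an incremental speedup.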
Why These Two Together
The pairing illustrates Ch.2’s core point: risks combine across categories. Production Web is primarily a systemic-risk scenario (no malicious AI required); AI 2027 is primarily a misalignment scenario (amplified by race dynamics, with systemic stakes). Real futures likely involve elements of both simultaneously.
Connection to Wiki
- ai-2027 — covers the AI 2027 scenario in much greater detail
- ai-population-explosion — production-web dynamics are the structural realization of Karnofsky’s thesis
- situational-awareness — Aschenbrenner’s compute-buildout numbers are echoed in the AI 2027 trajectory
- mass-unemployment — production web’s gig-economy transition realizes the wage-collapse mechanism
- risk-amplifiers — race dynamics, coordination failures, indifference all visible in Production Web