AI Safety Atlas Ch.1 — Forecasting Timelines
Source: Forecasting Timelines | ai-safety-atlas.com/chapters/v1/capabilities/forecasting-timelines/
When transformative AI (TAI) will automate cognitive labor determines which safety strategies are viable. “The difference between 10 years and 50 years fundamentally changes which safety strategies are viable and which risks demand immediate attention.” If TAI arrives by 2030, there is no time for slow theoretical research; if it arrives around 2050, there is breathing room for foundational alignment research, governance frameworks, and trustworthy evaluations.
How to Read Forecasts
The chapter takes an explicit epistemological stance: forecasts are not concrete predictions; they are scenarios for checking belief consistency. “If you think investment grows 5x per year AND software efficiency improves 3x per year AND hardware scales 15% per year AND each order of magnitude automates at least 10% of tasks, then what does that imply?” If the implied future surprises you, at least one input belief needs updating.
This is a calibration tool, not a crystal ball. “All models are wrong, but some are useful.”
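The quoted thought experiment can be multiplied out directly. A minimal sketch, using only the belief inputs named in the quote (the combined growth rate is my arithmetic, not a figure from the chapter):

```python
import math

# Belief inputs from the quoted thought experiment.
investment_growth = 5.0    # investment grows 5x per year
software_efficiency = 3.0  # software efficiency improves 3x per year
hardware_growth = 1.15     # hardware scales 15% per year

# Multiplying independent annual factors gives the implied joint growth.
effective_growth = investment_growth * software_efficiency * hardware_growth
print(f"Implied effective-compute growth: {effective_growth:.2f}x per year")

# How long until one more order of magnitude at that rate?
years_per_oom = math.log(10) / math.log(effective_growth)
print(f"One order of magnitude every {years_per_oom:.2f} years")
```

With these inputs, effective compute grows more than an order of magnitude per year; if each order of magnitude automates at least 10% of tasks, the implied future is very fast. If that surprises you, one of the three inputs should be revised.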
Current Trajectory (Mid-2025)
- Training compute: 5× per year since 2020
- Dataset sizes: 3.7× per year
- Training costs: 3.5× per year
Concrete reference point: training Grok-4 (mid-2025) cost ~$480 million all-in. By 2030, the current trajectory implies single training runs costing hundreds of billions of dollars and drawing gigawatts of power, comparable to a small city's consumption.
These aren’t impossible scales — “the question is whether the economic incentives justify it and whether the infrastructure constraints (chips, power, data) can be overcome in time.”
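A back-of-envelope extrapolation from the Grok-4 reference point, assuming the 3.5× per year cost trend simply continues (the strong assumption the chapter itself flags):

```python
# Extrapolate training-run cost forward from the Grok-4 reference point,
# assuming the 3.5x/year cost trend holds unchanged through 2030.
grok4_cost_usd = 480e6  # ~$480M all-in, mid-2025 (chapter figure)
cost_growth = 3.5       # training-cost growth per year (chapter figure)

for year in range(2025, 2031):
    cost = grok4_cost_usd * cost_growth ** (year - 2025)
    print(f"{year}: ${cost / 1e9:,.1f}B")
```

The 2030 value lands around $250B for a single run, consistent with the “hundreds of billions of dollars” claim above.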
The Compute → Automation Feedback Loop
Each compute order-of-magnitude automates a fraction of cognitive tasks. Once automated tasks generate enough economic value, the loop accelerates: compute → automation → productivity → more compute investment → more automation. This continues until full labor automation. The path runs through three resources: compute, data, and the economic returns funding both.
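The loop can be sketched as a toy difference model. Every parameter below is an illustrative assumption chosen to show the qualitative shape (acceleration), not a figure from the chapter:

```python
# Toy model of the compute -> automation -> investment feedback loop.
# All parameter values are illustrative assumptions, not source figures.
import math

compute = 1.0             # effective compute, arbitrary units
automated = 0.10          # fraction of cognitive tasks automated
base_invest_growth = 1.3  # compute growth with no automation payoff

for year in range(10):
    # Assumption: each order of magnitude of compute automates ~10% more tasks.
    automated = min(1.0, 0.10 + 0.10 * math.log10(compute))
    # Assumption: automated labor's economic value funds faster compute growth.
    invest_growth = base_invest_growth * (1 + automated)
    compute *= invest_growth
    print(f"year {year}: compute={compute:8.1f}, automated={automated:.0%}")
```

The year-over-year growth factor itself rises each step: automation feeds investment, which feeds compute, which feeds automation, which is the qualitative point of the loop.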
Massive Uncertainty: Biological Anchors
Even if compute-based scaling laws hold, how much compute TAI actually requires is deeply uncertain. The biological-anchors framework treats the human brain as a proof of concept and offers two reference points:
- Lower bound — compute needed to learn skills during a human lifetime: ~10²⁸ FLOP
- Upper bound — compute used by evolution to shape humans: ~10⁴¹ FLOP
That’s over twelve orders of magnitude of uncertainty. “It’s like not knowing if something costs one dollar or a trillion dollars. When someone says ‘AGI in 2045,’ they mean ‘somewhere in the 2030s–2050s range, with 2045 as a rough center’.”
This is a critical caveat the wiki should propagate: even Aschenbrenner’s confident OOM extrapolation has wide error bars on the target compute level, however well-measured the rate of progress is.
Effective Compute
The chapter formalizes effective-compute as a multiplicative quantity:
Effective compute = Software efficiency × Hardware efficiency × Number of chips
Each factor improves independently. As of 2025:
- Hardware efficiency — GPU performance grows 1.35× per year
- Algorithmic efficiency — software cuts the compute needed for a given result by ~3× per year
- Chip production — AI chip production has grown 2.3× per year since 2019
The total compounds much faster than any single input. This is the core variable underlying scaling-law extrapolation.
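Plugging the chapter's 2025 figures into the decomposition shows how fast the product compounds (the five-year total is my arithmetic, not a source figure):

```python
# Effective compute = software efficiency x hardware efficiency x chips,
# using the 2025 per-year growth figures from the chapter.
software = 3.0   # algorithmic efficiency improvement per year
hardware = 1.35  # GPU performance growth per year
chips = 2.3      # AI chip production growth per year

effective = software * hardware * chips
print(f"Effective compute grows ~{effective:.1f}x per year")

five_year = effective ** 5
print(f"Compounded over five years: ~{five_year:,.0f}x")
```

No single factor exceeds 3× per year, yet their product is ~9.3× per year, tens of thousands of times over five years.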
Training Data: The Approaching Wall
Dataset sizes for language models have grown 3.7× per year since 2010. The internet holds roughly 500 trillion tokens of high-quality public text; the largest 2024 models train on ~15 trillion tokens. At roughly 4× per year, high-quality public text data is exhausted sometime between 2026 and 2032.
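The exhaustion window follows from simple arithmetic on the figures above. A point estimate, assuming the stock and growth-rate figures hold exactly:

```python
# When does training-data demand hit the stock of high-quality public
# text, given the chapter's figures? A single point estimate.
import math

stock_tokens = 500e12  # ~500T tokens of high-quality public text
used_2024 = 15e12      # ~15T tokens, largest 2024 training runs
growth = 4.0           # ~4x dataset growth per year

years = math.log(stock_tokens / used_2024) / math.log(growth)
print(f"Exhausted ~{years:.1f} years after 2024 (around {2024 + years:.0f})")
```

The point estimate lands at the early end of the chapter's 2026–2032 window; the later end reflects uncertainty in the stock estimate and in how aggressively data is reused or filtered.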
Three escape routes:
- Multimodal data — the internet holds ~10 trillion images and ~10 trillion seconds of video. Could multiply the effective data supply by 3–10× if encoded efficiently.
- Synthetic data — AI generating its own training data. Removes the constraint if outputs are high-quality enough. Post-training RL on reasoning tasks already demonstrates this works in some domains.
- Task-based learning / self-play — RL through environment interaction. Works for any task formalizable as a game with explicit rules and success metrics. AlphaZero learned superhuman strategies without any human gameplay examples. Speculative for arbitrary tasks but in principle data-free.
Whether scaling continues through 2030 depends heavily on how well these alternatives substitute for raw human-generated text.
Connection to Wiki
This subchapter complements existing wiki content:
- scaling-laws — the wiki’s existing page covered Aschenbrenner’s OOM framework. This subchapter adds the formal effective-compute decomposition (software × hardware × chips) and the data-wall analysis.
- situational-awareness — the Atlas extends Aschenbrenner’s compute-based projection forward to mid-2025 with concrete numbers (Grok-4 at $480M, the 4× per year data trajectory).
- summary-bostrom-ai-expert-survey — the biological-anchors uncertainty framing complements the Müller & Bostrom 2014 expert-survey range (50% HLMI by 2040).
- ai-population-explosion — the data-wall analysis shapes what kind of AI population can be cheaply replicated.
The subchapter justifies a new effective-compute concept page in the wiki.