AGI Definitions and Thresholds
A precise framework — drawn from the AI Safety Atlas (Ch.1) — for defining the field’s commonly conflated terms (ANI, AGI, TAI, ASI) using two continuous dimensions: capability and generality. It replaces fuzzy “AGI yes/no” debates with quantitative descriptions of system performance.
The Two Dimensions
- Capability (depth) — how well a system executes specific tasks, expressed as a percentile relative to human performers. Range: 0% (cannot perform) → expert human (80–90th percentile) → superhuman (>100%, i.e. beyond the best human).
- Generality (breadth) — the percentage of cognitive domains where the system reaches expert level.
A description like “outperforms 85% of humans in 30% of cognitive domains” is precise enough to coordinate research, regulation, and forecasting around — unlike “is it AGI yet?”
Ten Cognitive Domains
The framework draws on Cattell-Horn-Carroll (CHC) theory in psychometrics. The ten measurable cognitive domains:
- General Knowledge — facts across science, culture, history, common sense
- Reading and Writing Ability
- Mathematical Ability — arithmetic through calculus
- On-the-Spot Reasoning — flexible problem-solving on novel tasks
- Working Memory — maintaining and manipulating active information
- Long-Term Memory Storage — continuous learning of new information
- Long-Term Memory Retrieval — accessing stored knowledge while avoiding hallucinations
- Visual Processing
- Auditory Processing
- Speed — performing simple cognitive tasks quickly
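Given the ten domains above, a description like “outperforms 85% of humans in 30% of cognitive domains” can be computed from a per-domain score profile. A minimal sketch, assuming expert level means the 85th percentile (midpoint of the 80–90th band) and summarizing capability as the median percentile over the expert-level domains; both choices, and all the scores below, are illustrative rather than canonical:

```python
EXPERT_BAR = 85  # percentile at which a domain counts as "expert level" (assumed)

def describe(profile: dict[str, float], bar: float = EXPERT_BAR) -> str:
    """Render a per-domain score profile as
    'outperforms X% of humans in Y% of cognitive domains'."""
    expert = sorted(p for p in profile.values() if p >= bar)
    breadth = 100 * len(expert) / len(profile)
    capability = expert[len(expert) // 2] if expert else 0  # median expert score
    return (f"outperforms {capability:.0f}% of humans "
            f"in {breadth:.0f}% of cognitive domains")

# Hypothetical scores for the ten CHC-derived domains listed above.
profile = {
    "general_knowledge": 92, "reading_writing": 95, "math": 88,
    "on_the_spot_reasoning": 70, "working_memory": 55,
    "lt_memory_storage": 25, "lt_memory_retrieval": 60,
    "visual": 75, "auditory": 50, "speed": 80,
}
print(describe(profile))  # → outperforms 92% of humans in 30% of cognitive domains
```

The two numbers the function emits are exactly the two dimensions: the second is generality (breadth), the first a summary of capability (depth).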
The Four Thresholds
ANI — Artificial Narrow Intelligence
Task-specific expertise from basic to superhuman in single domains. Chess engines, Go programs, AlphaFold. Today’s state for many specialized systems.
AGI — Artificial General Intelligence
Well-educated-adult versatility: 80–90th percentile capability across 80–90% of cognitive domains. Critically, this is a high bar: frontier LLMs in 2025 achieve high capability in some domains but lack breadth across all ten.
TAI — Transformative AI
Capable of triggering economic/social transitions on the scale of agricultural or industrial revolutions. The framework offers two profiles:
- Broad-moderate — 60th-percentile capability across many economically important tasks
- Narrow-superhuman — 99th-percentile capability in critical domains like automated ML research
The second profile matters because automating AI research itself could trigger an intelligence-explosion without requiring full AGI breadth.
ASI — Artificial Superintelligence
Superhuman capability (>100%) across 95%+ of cognitive domains. The endpoint of the intelligence-explosion in most scenarios.
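The four thresholds can be read as regions in the capability×generality plane. A sketch under stated assumptions: the numeric cutoffs come from this page, but the “many economically important tasks” condition for broad-moderate TAI is approximated as generality ≥ 0.5, an invented stand-in, and the check order is an illustrative choice.

```python
def threshold_label(capability: float, generality: float) -> str:
    """Classify a system against the four thresholds.

    capability: percentile vs. humans in the system's strong domains
                (>100 denotes beyond-best-human, i.e. superhuman).
    generality: fraction (0-1) of cognitive domains at that level.
    Checks run from strictest to loosest, returning the first match.
    """
    if capability > 100 and generality >= 0.95:
        return "ASI"  # superhuman across 95%+ of domains
    if capability >= 80 and generality >= 0.80:
        return "AGI"  # 80-90th percentile across 80-90% of domains
    # TAI: broad-moderate (60th percentile, wide coverage; the 0.5 breadth
    # cutoff is an assumed stand-in) or narrow-superhuman (99th percentile).
    if (capability >= 60 and generality >= 0.5) or capability >= 99:
        return "TAI"
    return "ANI"  # narrow expertise, however deep

print(threshold_label(capability=99, generality=0.1))  # → TAI
```

Note that a narrow-superhuman system (99th percentile at 10% breadth) lands in TAI without ever passing the AGI bar, which is the sense in which TAI can precede AGI.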
Alternative Frameworks
The Atlas acknowledges competing approaches:
- (t,n)-AGI framework — capability defined by matching n experts working for duration t. METR researchers operationalized this as task completion time horizons — what duration of professional task can the system complete reliably? Current frontier systems handle short tasks well but struggle with multi-day projects requiring sustained context. See metr.
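The time-horizon operationalization can be sketched simply as the longest task duration at which the system still succeeds at least half the time. A toy reading with invented numbers; METR's actual metric fits a curve across many tasks rather than thresholding raw rates like this:

```python
def time_horizon(results: list[tuple[float, float]], reliability: float = 0.5) -> float:
    """Longest task duration (minutes) whose success rate >= reliability.

    results: (duration_minutes, success_rate) pairs, one per task bucket.
    A crude stand-in for a fitted time-horizon curve.
    """
    qualifying = [duration for duration, rate in results if rate >= reliability]
    return max(qualifying, default=0.0)

# Invented evaluation results: short tasks reliable, multi-day projects not.
results = [(5, 0.95), (30, 0.80), (120, 0.55), (480, 0.20), (2880, 0.05)]
print(time_horizon(results))  # → 120 (roughly two-hour tasks)
```

Under the (t,n) reading, growth in this horizon, not a single AGI moment, is the quantity to forecast.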
Critical Limitations
Honest caveats from the framework’s authors:
- Anthropocentric bias — CHC theory was developed to measure humans; AI may have capabilities humans lack and therefore fail to recognize, and may lack abilities humans take for granted.
- Misleading linearity — “57% AGI” suggests linear progress when remaining capabilities may be disproportionately difficult. The last 10% of cognitive breadth could take 10× the work of the first 90%.
- Score-perception divergence — an LLM scoring 90% on a matrix-reasoning test behaves differently from a human achieving the same score. Benchmark numbers don’t directly translate to functional equivalence.
Why This Matters for Safety
Concrete thresholds enable concrete arguments:
- TAI before AGI — the Atlas explicitly notes that narrow-superhuman ML-research capability could trigger transformative dynamics before broad AGI is reached. This is the intelligence-explosion thesis at lower required generality.
- Coordination on definitions — when ai-governance efforts regulate “frontier models” or “AI systems posing severe risks,” operational definitions matter. Capability×generality offers something more concrete than vibes.
- Forecasting clarity — debates about “when AGI” can be reframed as “when 90% capability across 90% of domains” — a question with measurable progress.
Connection to Wiki
This page operationalizes:
- transformative-ai — gives the existing concept page a precise threshold definition
- ai-population-explosion — the population-explosion thesis runs on the narrow-superhuman TAI variant
- intelligence-explosion — the trigger condition is automating ML research, not full AGI
- capability-evaluations — connects to dangerous-capability evaluations as the operational testing layer
- summary-bostrom-ai-expert-survey — Müller & Bostrom’s “HLMI” maps onto AGI in this framework
- situational-awareness — Aschenbrenner’s “expert-human” trajectory is roughly the AGI threshold
Related Pages
- transformative-ai
- intelligence-explosion
- ai-population-explosion
- ai-autonomy-levels
- capability-evaluations
- scaling-laws
- metr
- ai-safety-atlas-textbook
- atlas-ch1-capabilities-01-defining-and-measuring-agi
- summary-bostrom-ai-expert-survey
- situational-awareness
Sources cited
Primary URLs harvested from this page’s summary references. Auto-generated by scripts/backfill_citations.py; edit by re-running, not by hand.
- AI Safety Atlas Ch.1 — Defining and Measuring AGI — referenced as [[atlas-ch1-capabilities-01-defining-and-measuring-agi]]
- Summary: Situational Awareness — The Decade Ahead — referenced as [[situational-awareness]]