AGI Definitions and Thresholds

A precise framework — drawn from the AI Safety Atlas (Ch.1) — for defining the field’s commonly-conflated terms (ANI, AGI, TAI, ASI) using two continuous dimensions: capability and generality. Replaces fuzzy “AGI yes/no” debates with quantitative descriptions of system performance.

The Two Dimensions

  • Capability (depth) — how well a system executes specific tasks. Range: 0% (cannot perform) → expert human (80–90th percentile) → superhuman (beyond the 100th percentile, i.e., better than any human).
  • Generality (breadth) — the percentage of cognitive domains where the system reaches expert level.

A description like “outperforms 85% of humans in 30% of cognitive domains” is precise enough to coordinate research, regulation, and forecasting around — unlike “is it AGI yet?”
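As a rough sketch, such a description can be computed mechanically from a per-domain capability profile. The domain names loosely track the ten domains listed in the next section; the numbers themselves are invented for illustration:

```python
# Hypothetical per-domain capability percentiles (0-100 scale; values
# above 100 denote superhuman performance). All numbers are illustrative.
profile = {
    "general_knowledge": 92,
    "reading_writing": 95,
    "math": 88,
    "on_the_spot_reasoning": 70,
    "working_memory": 60,
    "lt_memory_storage": 30,
    "lt_memory_retrieval": 55,
    "visual_processing": 75,
    "auditory_processing": 65,
    "speed": 110,
}

def breadth_at(profile: dict[str, float], threshold: float = 85.0) -> float:
    """Generality: percentage of domains where capability meets `threshold`."""
    at_level = sum(1 for p in profile.values() if p >= threshold)
    return 100.0 * at_level / len(profile)

# Reads as: "outperforms ~85% of humans in 40% of cognitive domains."
print(f"Expert-level in {breadth_at(profile):.0f}% of domains")
```

Capability is the per-domain number; generality is the share of domains clearing a chosen capability bar, so the two dimensions stay independent.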

Ten Cognitive Domains

The framework draws on Cattell-Horn-Carroll (CHC) theory in psychometrics. The ten measurable cognitive domains:

  1. General Knowledge — facts across science, culture, history, common sense
  2. Reading and Writing Ability
  3. Mathematical Ability — arithmetic through calculus
  4. On-the-Spot Reasoning — flexible problem-solving on novel tasks
  5. Working Memory — maintaining and manipulating active information
  6. Long-Term Memory Storage — continuous learning of new information
  7. Long-Term Memory Retrieval — accessing stored knowledge while avoiding hallucinations
  8. Visual Processing
  9. Auditory Processing
  10. Speed — performing simple cognitive tasks quickly

The Four Thresholds

ANI — Artificial Narrow Intelligence

Task-specific expertise, ranging from basic to superhuman, within single domains. Examples: chess engines, Go programs, AlphaFold. This describes the current state of many specialized systems.

AGI — Artificial General Intelligence

Well-educated-adult versatility: 80–90th-percentile capability across 80–90% of cognitive domains. Note that this is a high bar: frontier LLMs in 2025 achieve high capability in some domains but lack breadth across all ten.

TAI — Transformative AI

Capable of triggering economic/social transitions on the scale of agricultural or industrial revolutions. The framework offers two profiles:

  • Broad-moderate — 60th-percentile capability across many economically important tasks
  • Narrow-superhuman — 99th-percentile capability in critical domains like automated ML research

The second profile matters because automating AI research itself could trigger an intelligence-explosion without requiring full AGI breadth.

ASI — Artificial Superintelligence

Superhuman capability (beyond the 100th percentile) across 95%+ of cognitive domains. The endpoint of the intelligence-explosion in most scenarios.
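The four thresholds can be turned into a toy classifier over a per-domain percentile profile. The cutoffs below (85th percentile for "expert", 80% breadth for AGI, 95% superhuman breadth for ASI) are approximations of the numbers above, and TAI is deliberately omitted because it is defined by economic impact rather than a clean capability cutoff:

```python
def classify(profile: dict[str, float]) -> str:
    """Map a per-domain percentile profile onto ANI/AGI/ASI labels.
    Thresholds are illustrative approximations, not canonical values."""
    n = len(profile)
    expert_share = sum(p >= 85 for p in profile.values()) / n      # at/above expert human
    superhuman_share = sum(p > 100 for p in profile.values()) / n  # beyond the best human
    if superhuman_share >= 0.95:
        return "ASI"
    if expert_share >= 0.80:
        return "AGI"
    if expert_share > 0:
        return "ANI"  # expert or better in at least one domain only
    return "sub-expert"

# A narrow system: superhuman in one domain, weak elsewhere.
chess_like = {"game_playing": 120, "general_knowledge": 10, "math": 20}
print(classify(chess_like))  # classified as narrow (ANI)
```

The point of the sketch is that the labels fall out of two continuous measurements, rather than being categories with independent definitions.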

Alternative Frameworks

The Atlas acknowledges competing approaches:

  • (t,n)-AGI framework — capability defined by matching n experts working for duration t. METR researchers operationalized this as task completion time horizons — what duration of professional task can the system complete reliably? Current frontier systems handle short tasks well but struggle with multi-day projects requiring sustained context. See metr.
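The time-horizon operationalization can be sketched as follows. The data points are fabricated, and real METR estimates fit a curve across many tasks rather than taking a maximum over buckets; this is only the shape of the idea:

```python
# Invented (human_minutes, success_rate) observations for one system:
# how reliably it completes tasks that take a human professional N minutes.
observations = [
    (1, 0.98),
    (10, 0.90),
    (60, 0.70),
    (240, 0.45),
    (1440, 0.10),
]

def time_horizon(obs: list[tuple[int, float]], reliability: float = 0.5) -> int:
    """Longest task duration (in human-professional minutes) the system
    completes at or above `reliability`. Crude bucketed version."""
    qualifying = [minutes for minutes, rate in obs if rate >= reliability]
    return max(qualifying, default=0)

print(time_horizon(observations))        # 50%-reliability horizon
print(time_horizon(observations, 0.9))   # stricter 90%-reliability horizon
```

A rising horizon over successive model generations is the measurable quantity here, replacing a binary "is it (t,n)-AGI" judgment.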

Critical Limitations

Honest caveats from the framework’s authors:

  • Anthropocentric bias — CHC was developed for humans; AI may have capabilities humans lack but can’t recognize, and vice versa.
  • Misleading linearity — “57% AGI” suggests linear progress when remaining capabilities may be disproportionately difficult. The last 10% of cognitive breadth could take 10× the work of the first 90%.
  • Score-perception divergence — an LLM scoring 90% on a matrix-reasoning benchmark behaves differently from a human achieving the same score. Benchmark numbers don’t directly translate to functional equivalence.

Why This Matters for Safety

Concrete thresholds enable concrete arguments:

  • TAI before AGI — the Atlas explicitly notes that narrow-superhuman ML-research capability could trigger transformative dynamics before broad AGI is reached. This is the intelligence-explosion thesis at lower required generality.
  • Coordination on definitions — when ai-governance regulates “frontier models” or “AI systems posing severe risks,” operational definitions matter. Capability×generality offers an option more concrete than vibes.
  • Forecasting clarity — debates about “when AGI” can be reframed as “when 90% capability across 90% of domains” — a question with measurable progress.

Connection to Wiki

This page operationalizes:

  • transformative-ai — gives the existing concept page a precise threshold definition
  • ai-population-explosion — the population-explosion thesis runs on the narrow-superhuman TAI variant
  • intelligence-explosion — the trigger condition is automating ML research, not full AGI
  • capability-evaluations — connects to dangerous-capability evaluations as the operational testing layer
  • summary-bostrom-ai-expert-survey — Müller & Bostrom’s “HLMI” maps onto AGI in this framework
  • situational-awareness — Aschenbrenner’s “expert-human” trajectory is roughly the AGI threshold
