AI Safety Atlas Ch.1 — Defining and Measuring AGI
Source: Defining and Measuring AGI | ai-safety-atlas.com/chapters/v1/capabilities/defining-and-measuring-agi/
Progress on safety requires clear definitions and measurement criteria. This subchapter argues that, for safety planning, the field should focus on what systems can actually do — breaking intelligence down into specific, measurable capabilities — rather than chasing definitional debates about consciousness or “true” understanding.
Five Historical Approaches
The authors briefly assess five competing frameworks and explain why each falls short for safety purposes:
- Turing’s behavioral test — focuses on observable behavior. Modern systems expose its limits (GPT-4 passes conversational tests yet struggles with spatial reasoning), but the core intuition — that observable capabilities matter, not internal states — survives.
- Consciousness-based (Searle’s Chinese Room) — consciousness is harder to define than intelligence and largely irrelevant to safety. Whether a system is conscious matters less than whether it can perform dangerous tasks.
- Goal achievement (Legg & Hutter) — “intelligence measures an agent’s ability to achieve goals in a wide range of environments” captures intuition but lacks practical measurement criteria.
- Learning efficiency — for safety purposes, adaptability matters less than final capability: a dangerous capability poses the same risk whether it was learned quickly or slowly.
- Psychometric tradition (CHC theory) — the Cattell-Horn-Carroll framework breaks intelligence into measurable cognitive domains. The authors adopt this as their backbone.
The Capability×Generality Framework
The Atlas’s core proposal: measure AGI on two continuous dimensions.
- Capability (depth) — how well a system executes specific tasks, from 0% (cannot perform) through expert-human (80–90th percentile) to superhuman (>100%, i.e., beyond the best human performer).
- Generality (breadth) — the percentage of cognitive domains where the system reaches expert level.
Together they enable precise descriptions such as “outperforms 85% of humans in 30% of cognitive domains” (made concrete in the sketch after the domain list below).
Ten Cognitive Domains
From CHC theory, the framework identifies ten measurable capabilities: general knowledge, reading/writing, mathematics, on-the-spot reasoning, working memory, long-term memory storage, long-term memory retrieval, visual processing, auditory processing, and processing speed.
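Combining the two dimensions with this domain list is straightforward to encode. A minimal Python sketch: the domain names and the 80–90th-percentile expert band come from the text, while every score and the `describe` helper are illustrative assumptions:

```python
# Hypothetical sketch of the capability x generality framework. Scores are
# human percentiles per domain (100 = best human; >100 = superhuman).

EXPERT = 80  # lower edge of the expert band (80-90th percentile)

# The ten CHC-derived cognitive domains named in this section.
DOMAINS = [
    "general knowledge", "reading/writing", "mathematics",
    "on-the-spot reasoning", "working memory", "long-term memory storage",
    "long-term memory retrieval", "visual processing",
    "auditory processing", "processing speed",
]

def describe(scores: dict[str, float]) -> str:
    """Render a profile in the section's format, e.g.
    'outperforms 85% of humans in 30% of cognitive domains'."""
    expert = [scores.get(d, 0.0) for d in DOMAINS if scores.get(d, 0.0) >= EXPERT]
    breadth = 100 * len(expert) / len(DOMAINS)           # generality
    depth = sum(expert) / len(expert) if expert else 0   # capability where expert
    return f"outperforms {depth:.0f}% of humans in {breadth:.0f}% of cognitive domains"

profile = {"general knowledge": 92, "reading/writing": 88,
           "mathematics": 75, "visual processing": 40}   # hypothetical scores
print(describe(profile))  # -> outperforms 90% of humans in 20% of cognitive domains
```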
Threshold Definitions
This produces clean definitions for terms the field commonly conflates:
- ANI (Artificial Narrow Intelligence) — task-specific expertise, anywhere from basic to superhuman, confined to single domains. Examples: chess engines, Go programs.
- AGI (Artificial General Intelligence) — well-educated-adult-level versatility: 80–90th percentile across 80–90% of cognitive domains.
- TAI (Transformative AI) — capable of triggering economic/social transitions on the scale of the agricultural or industrial revolutions. Either moderate capability (60th percentile) across many economic tasks, or 99th-percentile capability in critical domains like ML research.
- ASI (Artificial Superintelligence) — superhuman capability (>100%, beyond the best human) across 95%+ of cognitive domains.
This framework gives the wiki’s existing transformative-ai page a precise threshold definition it previously lacked.
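The thresholds translate directly into a classifier. A hedged sketch, assuming per-domain percentile profiles as above; the source supplies the criteria, but the aggregation details here (the 80% "many economic tasks" cutoff, the critical-domain set) are assumptions:

```python
def capability_tier(scores: dict[str, float]) -> str:
    """ANI / AGI / ASI ladder per the section's thresholds. Scores are
    human percentiles per cognitive domain (values above 100 stand in
    for 'superhuman', i.e., beyond the best human)."""
    vals = list(scores.values())

    def frac_at(p: float) -> float:
        return sum(v >= p for v in vals) / max(len(vals), 1)

    if frac_at(100) >= 0.95:   # superhuman across 95%+ of domains
        return "ASI"
    if frac_at(80) >= 0.8:     # 80-90th percentile across 80-90% of domains
        return "AGI"
    return "ANI"               # expertise confined to isolated domains

def is_tai(econ_scores: dict[str, float],
           critical: frozenset = frozenset({"ML research"})) -> bool:
    """TAI test: either moderate capability (60th percentile) across many
    economic tasks, or 99th-percentile capability in a critical domain.
    The 80% 'many tasks' cutoff is an assumption, not from the source."""
    broad = sum(v >= 60 for v in econ_scores.values()) / max(len(econ_scores), 1)
    deep = any(econ_scores.get(d, 0) >= 99 for d in critical)
    return broad >= 0.8 or deep
```

Keeping TAI off the ANI/AGI/ASI ladder reflects the source's definition: TAI is an impact threshold, so a narrow system at the 99th percentile in ML research could be transformative without being an AGI.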
Autonomy Levels (Separate from Capability)
A key safety insight: capability and deployment autonomy must be considered separately. A highly capable system deployed as a tool poses different risks than the same system deployed as an agent. See ai-autonomy-levels:
- L0 — No AI / pure human operation
- L1 — AI as Tool: suggests; humans decide
- L2 — AI as Consultant: advises; humans direct
- L3 — AI as Collaborator: equal partnership
- L4 — AI as Expert: handles execution with oversight
- L5 — AI as Agent: fully autonomous, minimal oversight
This connects directly to the wiki’s ai-agents and ai-control pages — control becomes harder as autonomy rises.
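The ladder itself is easy to encode. A minimal sketch; the `needs_control_measures` helper is an illustrative assumption, not part of the source framework:

```python
from enum import IntEnum

class AutonomyLevel(IntEnum):
    """L0-L5 deployment-autonomy ladder. Deliberately orthogonal to
    capability: the same model can sit at different levels depending
    on how it is deployed."""
    NO_AI        = 0  # pure human operation
    TOOL         = 1  # AI suggests; humans decide
    CONSULTANT   = 2  # AI advises; humans direct
    COLLABORATOR = 3  # equal partnership
    EXPERT       = 4  # AI handles execution with human oversight
    AGENT        = 5  # fully autonomous, minimal oversight

def needs_control_measures(level: AutonomyLevel) -> bool:
    """Illustrative assumption (not from the source): treat EXPERT and
    above as requiring explicit control mechanisms, since oversight
    thins out as autonomy rises."""
    return level >= AutonomyLevel.EXPERT
```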
Alternative Frameworks
The text acknowledges the (t,n)-AGI framework — capability defined as matching the performance of n human experts working for duration t. METR researchers operationalized the time dimension by measuring task-completion time horizons, finding that current systems handle short professional tasks but struggle with multi-day projects requiring sustained context. See metr.
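A rough sketch of the time-horizon idea: METR fit a logistic curve of success probability against log task length, and the moving-window estimate below is a simplified stand-in for that fit, with the window size an assumption:

```python
import math

def time_horizon_50(tasks: list[tuple[float, bool]]) -> float:
    """Rough 50%-success time horizon: the longest human-task duration
    (in minutes) at which the model still succeeds at least half the
    time. Each task pairs the time a human needs with whether the
    model succeeded."""
    tasks = sorted(tasks)                      # ascending human time
    window = max(3, len(tasks) // 10)          # smooth over nearby lengths
    horizon = 0.0
    for i in range(len(tasks) - window + 1):
        chunk = tasks[i:i + window]
        rate = sum(ok for _, ok in chunk) / window
        if rate >= 0.5:                        # still above 50% success here
            # geometric mean of the window's task lengths
            horizon = math.exp(sum(math.log(t) for t, _ in chunk) / window)
    return horizon
```

On this measure, a system that clears most ten-minute professional tasks but fails multi-day projects reports a horizon of tens of minutes, matching the qualitative finding above.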
Critical Limitations
The authors candidly flag weaknesses in their own framework:
- Anthropocentric bias — human-centered taxonomies like CHC may omit abilities that humans exercise so effortlessly they are rarely recognized as cognitive, yet that AI systems lack.
- Misleading linearity — percentage scores (“57% AGI”) suggest linear progress when remaining capabilities may be disproportionately difficult.
- Score-perception divergence — an LLM scoring 90% on matrix-reasoning tests does not behave like a human achieving the same score; equal numbers do not certify equal underlying competence.
The field is still converging on which cognitive abilities matter most and how to weight them — but the capability–generality framework remains foundational for safety discussions.
Connection to Wiki
This subchapter gives the wiki’s existing concepts (transformative-ai, ai-population-explosion, ai-control) a clean operational definition layer. It also justifies creating dedicated pages for agi-definitions-and-thresholds (the four-tier ANI/AGI/TAI/ASI scheme) and ai-autonomy-levels (the L0–L5 framework).