AI Agents
AI agents are AI systems capable of taking sequences of actions in the world — using tools, controlling computers, browsing the web, writing and executing code, and pursuing multi-step goals autonomously. They represent a qualitative shift from AI as a question-answering tool to AI as an active participant.
What Makes a System Agentic
A base language model responds to a prompt and stops. An agent wraps that model in a scaffold (sketched in code after this list) that allows it to:
- Use tools (search, code execution, file systems)
- Take actions whose results feed back into subsequent steps
- Pursue goals over extended time horizons
- Operate with minimal human intervention per step
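A minimal sketch of such a scaffold, in Python. This is illustrative only: `call_model`, the tool registry, and the message format are assumptions for the example, not any particular framework's or model provider's API.

```python
# Minimal agent-loop sketch. `call_model` and the tools are hypothetical stand-ins;
# real scaffolds add planning, memory, sandboxing, and a real model API.

def search(query: str) -> str:
    return f"(stub) results for {query!r}"      # placeholder tool

TOOLS = {"search": search}

def call_model(messages: list[dict]) -> dict:
    # Hypothetical stand-in for a real model call, scripted so the sketch runs:
    # first request a search, then return a final answer.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "search", "args": {"query": messages[0]["content"]}}
    return {"final": "answer based on: " + messages[-1]["content"]}

def run_agent(goal: str, max_steps: int = 10) -> str:
    messages = [{"role": "user", "content": goal}]
    for _ in range(max_steps):                  # bounded autonomy
        decision = call_model(messages)
        if "final" in decision:                 # the model decides it is done
            return decision["final"]
        result = TOOLS[decision["tool"]](**decision["args"])
        messages.append({"role": "tool", "content": result})  # result feeds back in
    return "step budget exhausted"

print(run_agent("What changed in the latest release?"))
```

The defining feature is the loop: each tool result is appended to the conversation and shapes the next model call, which is what distinguishes an agent from a single-turn completion.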
leopold-aschenbrenner’s Situational Awareness frames “computer use” (multimodal models that control computers the way humans do) as a key “unhobbling” gain that turns capable models into true agents. Unhobbling is one of the three drivers of effective compute growth in that framing, alongside raw compute scaling and algorithmic efficiency.
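As a rough illustration of how the three drivers combine (the specific numbers below are made up, not figures from Situational Awareness): gains are counted in orders of magnitude (OOMs), which add when the underlying multiplicative factors are combined.

```python
# Hypothetical OOM gains per driver; illustrative numbers only, not sourced figures.
raw_compute_ooms = 2.0    # hardware and cluster scale-up (assumed)
algorithmic_ooms = 1.5    # training-efficiency improvements (assumed)
unhobbling_ooms  = 1.0    # scaffolding, tools, computer use (assumed)

total_ooms = raw_compute_ooms + algorithmic_ooms + unhobbling_ooms
print(f"effective compute gain ~ 10^{total_ooms:.1f} = {10**total_ooms:,.0f}x")
```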
Capability Jump from Scaffolding
The capability gain from agentic scaffolding is substantial even without model improvements. GPT-4 scored 2% on the SWE-Bench software engineering benchmark as a bare model, but 14-23% when wrapped in an agent framework. This represents a 7-11x jump from scaffolding alone — equivalent to a major model upgrade in terms of real-world task performance.
Safety Implications
Agentic AI systems raise distinct safety challenges compared to single-turn models:
- Error compounding: Mistakes in early steps cascade through subsequent actions.
- Irreversibility: Actions in the world (sending emails, modifying files, executing code) may be difficult or impossible to undo.
- Oversight difficulty: Supervising a long chain of actions is harder than reviewing a single response.
- Deceptive alignment: An agent with misaligned goals has more opportunity to act on them during extended autonomous operation.
These challenges connect to active research in scalable-oversight, ai-control, and interpretability.
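To make one of these challenges concrete, a mitigation pattern often discussed in ai-control work is to gate potentially irreversible actions behind human approval. The sketch below is illustrative; the tool names and the reversibility classification are assumptions, not a method from the cited sources.

```python
# Illustrative oversight gate: auto-approve known-reversible tools, require human
# sign-off for irreversible ones, and default-deny anything unclassified.
REVERSIBLE = {"search", "read_file"}                        # assumed classification
IRREVERSIBLE = {"send_email", "delete_file", "run_shell"}   # assumed classification

def execute_with_oversight(tool_name: str, args: dict, tools: dict) -> str:
    if tool_name in IRREVERSIBLE:
        print(f"approval needed: {tool_name}({args})")
        if input("approve? [y/N] ").strip().lower() != "y":
            return "action rejected by human overseer"
    elif tool_name not in REVERSIBLE:
        return f"unknown tool {tool_name!r} refused"         # fail closed
    return tools[tool_name](**args)
```

Defaulting to denial for unclassified tools is the conservative choice here: the gate fails closed rather than open.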
Relation to Intelligence Explosion
Agents that can write code and conduct research are a prerequisite for the intelligence-explosion scenario. Once AI agents can improve AI training pipelines, the feedback loop between AI capability and AI-assisted research begins to accelerate.
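A toy model of that feedback loop (purely illustrative; the parameters are invented and this is not a forecast or a model from the cited sources): research effort is human labor plus an AI contribution that grows with capability, and capability growth scales with total effort.

```python
# Toy feedback loop: AI agents add research effort, which raises capability,
# which raises the AI contribution next round. All numbers are invented.
human_effort = 1.0
capability   = 1.0
automation   = 0.1    # fraction of research tasks AI agents can take on (assumed)

for year in range(1, 9):
    effort = human_effort + automation * capability   # AI-assisted research labor
    capability *= 1 + 0.3 * effort                    # progress scales with effort (assumed)
    automation = min(1.0, automation + 0.1)           # agents handle more over time
    print(f"year {year}: capability ~ {capability:.1f}")
```

Growth stays roughly exponential while humans dominate the effort term, then accelerates sharply once the AI contribution does, which is the qualitative point of the feedback-loop argument.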
Related Pages
- transformative-ai
- intelligence-explosion
- scalable-oversight
- ai-control
- deceptive-alignment
- interpretability
- scaling-laws
- sa-ch1-from-gpt4-to-agi
- situational-awareness
- leopold-aschenbrenner
- ai-autonomy-levels
- foundation-models
- autonomy-evals
- ai-safety-atlas-textbook
- atlas-ch1-capabilities-04-current-capabilities
Sources cited
Primary URLs harvested from this page’s summary references. Auto-generated by scripts/backfill_citations.py; edit by re-running, not by hand.
- AI Safety Atlas Ch.1 — Current Capabilities — referenced as [[atlas-ch1-capabilities-04-current-capabilities]]
- Summary: Situational Awareness — The Decade Ahead — referenced as [[situational-awareness]]