AI Safety Levels (ASL)
AI Safety Levels (ASL) are Anthropic’s framework of standardized capability tiers, each demanding increasingly rigorous safety measures. The structure is the operational backbone of Anthropic’s Responsible Scaling Policy (RSP) and is modeled on the biosafety levels used in infectious disease research, where increasingly dangerous pathogens require increasingly stringent containment.
The Levels
| Level | Description |
|---|---|
| ASL-1 | Systems posing no meaningful catastrophic risk |
| ASL-2 | Shows early signs of dangerous capabilities (e.g., information relevant to bioweapon assembly, but not beyond what search engines already provide) |
| ASL-3 | Substantially increases catastrophic misuse risk vs. non-AI baselines OR shows low-level autonomous capabilities |
| ASL-4 | Not yet defined — qualitative escalation in misuse potential and autonomy |
| ASL-5+ | Not yet defined — distance from current systems precludes specification |
The undefined upper levels are intentional: ASL-4/5 will involve qualitative jumps, not just quantitative escalations of ASL-3.
How ASLs Trigger Action
Each level corresponds to required safeguards:
- ASL-1 — basic safety measures
- ASL-2 — capability-evaluation requirements + standard security
- ASL-3 — comprehensive security and deployment restrictions, including escalating protections for model weights
When a model crosses an ASL threshold, the additional safeguards must already be in place: this is the if-then-commitments pattern made operational. A minimal sketch of the gating logic follows.
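As a rough illustration of that if-then gating logic (not Anthropic's actual tooling or safeguard list), the Python sketch below maps hypothetical ASL classifications to required safeguards and blocks further scaling or deployment until every required safeguard is in place. All level definitions, safeguard names, and functions here are illustrative assumptions.

```python
from enum import IntEnum

class ASL(IntEnum):
    """Illustrative ASL tiers; the real definitions live in Anthropic's RSP."""
    ASL_1 = 1
    ASL_2 = 2
    ASL_3 = 3

# Hypothetical safeguard requirements per level; purely for illustration.
REQUIRED_SAFEGUARDS = {
    ASL.ASL_1: {"basic_safety_measures"},
    ASL.ASL_2: {"capability_evaluations", "standard_security"},
    ASL.ASL_3: {"hardened_security", "deployment_restrictions", "weight_protection"},
}

def safeguards_for(level: ASL) -> set:
    """All safeguards required at or below the given level (requirements accumulate)."""
    required = set()
    for tier in ASL:
        if tier <= level:
            required |= REQUIRED_SAFEGUARDS[tier]
    return required

def may_proceed(assessed_level: ASL, safeguards_in_place: set) -> bool:
    """If the model is assessed at `assessed_level`, then every required
    safeguard must already be implemented before scaling or deployment."""
    missing = safeguards_for(assessed_level) - safeguards_in_place
    if missing:
        print(f"Blocked: missing safeguards {sorted(missing)}")
        return False
    return True

# Example: a model assessed at ASL-3 with only ASL-1/2 safeguards is blocked.
print(may_proceed(ASL.ASL_3, {"basic_safety_measures",
                              "capability_evaluations",
                              "standard_security"}))
```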
Required Evaluation Categories
ASLs require multiple evaluation categories working together:
- Capability evaluations — detect dangerous abilities (autonomous replication, CBRN, cyberattack)
- Safety evaluations — verify control measures remain effective
Both categories must work together: passing capability evaluations is not sufficient if safety evaluations indicate that control measures are no longer effective. A hedged sketch of this joint gate follows.
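To make the joint gate concrete, here is a minimal hypothetical sketch; the evaluation names and pass/fail structure are assumptions, not Anthropic's actual evaluation suite. Deployment under the current safeguard regime is blocked if any capability flag exceeds what those safeguards cover, or if any safety check fails.

```python
from dataclasses import dataclass

@dataclass
class EvaluationResults:
    """Hypothetical container for one evaluation round (names are illustrative)."""
    capability_flags: dict   # e.g. {"autonomous_replication": False, "cbrn_uplift": False}
    safety_checks: dict      # e.g. {"control_measures_effective": True}

def evaluation_gate(results: EvaluationResults) -> bool:
    """Both categories gate together: a tripped capability flag or a failed
    safety check each independently blocks deployment under current safeguards."""
    no_capability_concerns = not any(results.capability_flags.values())
    all_safety_checks_pass = all(results.safety_checks.values())
    return no_capability_concerns and all_safety_checks_pass

# Passing capability evaluations alone is not enough: a failed safety check still blocks.
print(evaluation_gate(EvaluationResults(
    capability_flags={"autonomous_replication": False, "cbrn_uplift": False},
    safety_checks={"control_measures_effective": False},
)))  # -> False
```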
Comparison with Other Frameworks
The ASL approach has parallels in:
- OpenAI Preparedness Framework — uses per-category risk ratings (e.g., cybersecurity, persuasion, autonomous replication) on a low-to-critical scale rather than fixed levels
- DeepMind Frontier Safety Framework — uses Critical Capability Levels (CCLs) per domain (bio, cyber, autonomy) rather than unified levels
| Framework | Structure | Threshold-trigger pattern |
|---|---|---|
| Anthropic ASL | Unified levels (ASL-1 → 5+) | Cross level → required safeguards |
| OpenAI PF | Per-category risk (low → critical) | Pre-/post-mitigation evaluation |
| DeepMind FSF | Per-domain CCLs | Domain-specific mitigation combinations |
All three are governance variants of the same underlying pattern: gated scaling based on evaluation results.
The Biosafety Analogy
The biosafety analogy is pedagogically useful:
- BSL-1 — minimal risk, basic procedures
- BSL-2 — moderate risk, restricted access
- BSL-3 — serious disease, biocontainment lab
- BSL-4 — life-threatening, maximum biocontainment
The pattern: standardized risk classification + standardized containment requirements + clear escalation triggers. ASL applies the same structure to AI capabilities.
The analogy has limits. AI capabilities are harder to characterize than pathogens, which have stable biological properties: a given pathogen stays at its classification, while an AI system can move up the ladder unpredictably as models are scaled or given new tools.
Connection to Wiki
- responsible-scaling-policy — the policy document that defines and operationalizes the ASL system
- evaluation-frameworks — ASL is one governance framework
- frontier-safety-frameworks — the multi-company cluster of frameworks to which ASL belongs
- if-then-commitments — the operational pattern
- capability-evaluations — the evidence base for ASL classification
- anthropic — the org behind ASL
Related Pages
- responsible-scaling-policy
- evaluation-frameworks
- frontier-safety-frameworks
- if-then-commitments
- capability-evaluations
- anthropic
- ai-safety-atlas-textbook
- atlas-ch5-evaluations-07-evaluation-frameworks
Sources cited
Primary URLs harvested from this page’s summary references. Auto-generated by scripts/backfill_citations.py; edit by re-running, not by hand.
- AI Safety Atlas Ch.5 — Evaluation Frameworks — referenced as [[atlas-ch5-evaluations-07-evaluation-frameworks]]