AI Safety Atlas Ch.4 — Governance Problems

AI governance differs fundamentally from traditional technology regulation because AI is simultaneously a general-purpose technology, an information processor, and potentially an intelligent system, a combination that creates challenges without precedent.

Why Traditional Governance Fails

Conventional approaches (pharmaceutical clinical trials, nuclear treaties, facility monitoring) assume:

  • Predictable development paths
  • Clear, narrow applications
  • Controllable physical infrastructure

AI breaks all three:

  • General-purpose nature — one system reshapes healthcare, finance, transportation, education simultaneously; sector-specific regulation is insufficient
  • Information capabilities — generates and manipulates content at scales conventional frameworks weren’t designed for
  • Intelligence dimension — sufficiently capable systems may evade controls or pursue unintended objectives, a dynamic without precedent

Three Fundamental Problems

1. Unexpected Capabilities

Foundation models exhibit emergent abilities appearing suddenly with scale. GPT-3 unexpectedly performed arithmetic; later models showed unanticipated reasoning. Current evaluations cannot “guarantee absence of unknown threats, forecast new emergent abilities, or assess risks from autonomous systems.” Testing best practices remain nascent.

2. Deployment Safety

Once released, AI systems get repurposed beyond their intended uses: the same dialogue model can generate misinformation at scale or assist cyberattacks. Jailbreaks bypass built-in safety measures. Autonomous AI agents amplify these risks by chaining capabilities over extended periods, making post-deployment behavior increasingly difficult to predict.

3. Proliferation

“AI models are patterns of numbers instantly copied and transmitted globally.” Model weights spread via cyberattacks, insider leaks, and rapid replication. Open-source clones of models like ChatGPT can have their safety features stripped, revealing dangerous capabilities. Model distillation extracts capabilities without any access to the weights. Unlike nuclear materials, containment is fundamentally impossible.

Effective Governance Targets

Successful intervention points need three properties:

  • Measurability — compute used in training measurable in FLOPs (clear thresholds, compliance monitoring)
  • Controllability — practical mechanisms must exist (semiconductor supply-chain chokepoints enable export controls)
  • Meaningfulness — must address fundamental capability/risk drivers (regulating UIs ≠ preventing emergent capability)
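The measurability point can be made concrete. Training compute is commonly estimated with the C ≈ 6·N·D rule of thumb (N = parameter count, D = training tokens), which is what makes FLOP-based thresholds auditable. Below is a minimal sketch of such a threshold check; the 1e25 FLOP figure and the function names are illustrative assumptions, not a specific regulation's values.

```python
# Sketch: checking a training run against a compute-based reporting threshold.
# Assumes the common C ~= 6 * N * D approximation for total training FLOPs.
# The threshold below is illustrative, not drawn from any particular statute.

def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate total training compute in FLOPs via C ~= 6ND."""
    return 6.0 * n_params * n_tokens

def exceeds_threshold(n_params: float, n_tokens: float,
                      threshold_flops: float = 1e25) -> bool:
    """Would this run cross the (hypothetical) reporting threshold?"""
    return training_flops(n_params, n_tokens) >= threshold_flops

# Example: a 70B-parameter model trained on 2T tokens.
flops = training_flops(70e9, 2e12)          # 6 * 7e10 * 2e12 = 8.4e23 FLOPs
print(f"{flops:.2e}", exceeds_threshold(70e9, 2e12))
```

Because both inputs are observable (chip-hours imply N·D to within a constant factor), regulators can monitor compliance without inspecting model internals, which is exactly the property the bullet list above requires.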

Key intervention points span the development pipeline:

  • Early: compute infrastructure, training data
  • During: safety frameworks, monitoring systems
  • Deployment: monitoring controls, post-deployment measures

Connection to Wiki

This subchapter provides the analytical foundation for the rest of Ch.4: the three target properties (measurable, controllable, meaningful) serve as the criterion for evaluating any governance intervention and frame the chapter's priorities. It also deepens existing wiki pages.