AI Safety Atlas Ch.4 — Governance Problems
Source: Governance Problems
AI governance differs fundamentally from traditional technology regulation because AI is simultaneously a general-purpose technology, an information processor, and potentially an intelligent system, a combination that creates challenges without precedent.
Why Traditional Governance Fails
Conventional approaches (pharmaceutical clinical trials, nuclear treaties, facility monitoring) assume:
- Predictable development paths
- Clear, narrow applications
- Controllable physical infrastructure
AI breaks all three:
- General-purpose nature — one system reshapes healthcare, finance, transportation, education simultaneously; sector-specific regulation is insufficient
- Information capabilities — generates and manipulates content at scales conventional frameworks weren’t designed for
- Intelligence dimension — sufficiently capable systems may evade controls or pursue unintended objectives, a dynamic without precedent
Three Fundamental Problems
1. Unexpected Capabilities
Foundation models exhibit emergent abilities that appear suddenly with scale. GPT-3 unexpectedly performed arithmetic; later models showed unanticipated reasoning. Current evaluations cannot "guarantee absence of unknown threats, forecast new emergent abilities, or assess risks from autonomous systems." Testing best practices remain nascent.
2. Deployment Safety
Once released, AI systems get repurposed beyond their intended uses: the same dialogue model can be turned to misinformation or cyberattacks. Jailbreaks bypass safety measures. Autonomous AI agents amplify these risks by chaining capabilities over extended periods, making post-deployment behavior increasingly difficult to predict.
3. Proliferation
"AI models are patterns of numbers instantly copied and transmitted globally." Model weights spread via cyberattacks, insider leaks, and rapid replication. Open-source ChatGPT clones strip safety features and reveal dangerous capabilities. Model distillation extracts capabilities without access to the weights at all. Unlike nuclear materials, model weights cannot be physically contained.
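The distillation point deserves a concrete illustration: a student model is trained to match a teacher's output distribution, so query access alone (e.g. via an API) is enough to copy behavior without ever touching the weights. A minimal toy sketch of the standard distillation loss, in plain Python (the function names and example logits are illustrative, not from the source):

```python
# Toy sketch of knowledge distillation: the student is trained to minimize
# KL(teacher || student) on temperature-softened output distributions.
# Only the loss is shown; real distillation optimizes a full network
# against many such teacher-provided targets.
import math

def softmax(logits, temperature=1.0):
    """Convert raw logits into a probability distribution."""
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from teacher to student on softened distributions."""
    p = softmax(teacher_logits, temperature)  # targets, obtainable via API
    q = softmax(student_logits, temperature)  # student's current predictions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# The loss vanishes when the student exactly mimics the teacher's outputs,
# which is why output access alone suffices to transfer capability.
matched = distillation_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0])
unmatched = distillation_loss([2.0, 0.5, -1.0], [0.0, 0.0, 0.0])
```

This is why weight-security measures alone cannot stop proliferation: any sufficiently open query interface leaks the distribution the weights encode.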
Effective Governance Targets
Successful intervention points need three properties:
- Measurability — training compute is quantifiable in FLOPs, enabling clear thresholds and compliance monitoring
- Controllability — practical mechanisms must exist (semiconductor supply-chain chokepoints enable export controls)
- Meaningfulness — must address fundamental capability/risk drivers (regulating UIs ≠ preventing emergent capability)
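The measurability property can be made concrete with a back-of-the-envelope compute check. A minimal sketch, assuming the common C ≈ 6·N·D approximation for training compute (N parameters, D training tokens) and a 1e25 FLOP threshold of the kind used in the EU AI Act's systemic-risk tier; the function names and example figures are illustrative, not official guidance:

```python
# Sketch: estimating training compute and testing it against a regulatory
# threshold. Assumes the rule-of-thumb C ~= 6 * N * D from the scaling-laws
# literature; the 1e25 FLOP cutoff mirrors the EU AI Act's systemic-risk
# threshold but is used here purely as an illustration.

def training_flops(params: float, tokens: float) -> float:
    """Rough total training compute in FLOPs via the 6*N*D approximation."""
    return 6.0 * params * tokens

def exceeds_threshold(params: float, tokens: float,
                      threshold: float = 1e25) -> bool:
    """Would this training run cross the compute threshold?"""
    return training_flops(params, tokens) >= threshold

# A hypothetical 70B-parameter model trained on 15T tokens:
flops = training_flops(70e9, 15e12)          # ~6.3e24 FLOPs
below = exceeds_threshold(70e9, 15e12)       # below the 1e25 cutoff
above = exceeds_threshold(200e9, 20e12)      # ~2.4e25, above the cutoff
```

A single scalar like this is exactly what makes compute a usable governance target: it can be estimated before training, reported, and audited, unlike fuzzier notions such as "capability."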
Key intervention points span the development pipeline:
- Early: compute infrastructure, training data
- During: safety frameworks, monitoring systems
- Deployment: monitoring controls, post-deployment measures
Connection to Wiki
This subchapter provides the analytical foundation for the rest of Ch.4. The "three target properties" (measurable, controllable, meaningful) form the criterion for evaluating any governance intervention. It also frames why the chapter prioritizes:
- compute-governance — meets all three criteria
- data-governance — partial: meaningful but harder to measure/control
It deepens existing wiki pages:
- ai-governance — adds the structural-difficulty argument
- risk-amplifiers — proliferation and emergent capabilities are amplifiers
- atlas-ch1-capabilities-03-leveraging-scale — emergent capabilities and broken neural scaling laws (BNSL) are why governance struggles