Mutual Assured AI Malfunction (MAIM)
Mutual Assured AI Malfunction (MAIM) is a proposed deterrence regime for artificial superintelligence (ASI) risk, in which any unilateral attempt at ASI dominance triggers sabotage by rivals. The AI Safety Atlas (Ch.3.5) treats it as one of the four main asi-safety-strategies, specifically under deterrence approaches.
The Strategic Logic
MAIM resembles nuclear Mutual Assured Destruction (MAD) dynamics but applied to ASI development:
- Multiple actors monitor each other for ASI development progress
- If any single actor approaches decisive ASI capability, others trigger pre-arranged sabotage actions
- The credible threat of mutual sabotage disincentivizes unilateral racing
The key insight: MAIM aligns national interests with global safety without requiring perfect cooperation. Each actor has a unilateral incentive to participate, because any defection toward ASI dominance invites sabotage by the others.
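The incentive structure above can be sketched as a toy two-player game. All payoff numbers below are illustrative assumptions, not figures from the Atlas; the point is only that credible sabotage changes the best response to a restrained rival from racing to restraint.

```python
# Toy 2-player game illustrating MAIM's deterrence logic.
# Payoff values are illustrative assumptions, not from the source.

RESTRAIN, RACE = 0, 1

def payoff(me, rival, sabotage_credible):
    """Payoff to `me` given both players' actions (higher is better)."""
    if me == RESTRAIN and rival == RESTRAIN:
        return 3                                   # stable status quo
    if me == RACE and rival == RESTRAIN:
        # Unilateral racing yields dominance unless rivals credibly sabotage it.
        return 0 if sabotage_credible else 5
    if me == RESTRAIN and rival == RACE:
        return 0 if sabotage_credible else -5      # rival dominates unless sabotaged
    return -2                                      # both race: wasteful, destabilizing

def best_response(rival, sabotage_credible):
    """Action maximizing my payoff against a fixed rival action."""
    return max((RESTRAIN, RACE),
               key=lambda a: payoff(a, rival, sabotage_credible))

# Without credible sabotage, racing is the best response to restraint...
assert best_response(RESTRAIN, sabotage_credible=False) == RACE
# ...with credible sabotage, mutual restraint is a stable best response.
assert best_response(RESTRAIN, sabotage_credible=True) == RESTRAIN
assert best_response(RACE, sabotage_credible=True) == RESTRAIN
```

The assertions show the regime's claimed effect: once sabotage is credible, restraint is each actor's best response regardless of what the rival does.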
What MAIM Requires
- Detection — surveillance and monitoring sufficient to identify when an actor approaches ASI thresholds
- Escalation ladders — pre-defined responses that can be credibly triggered
- Strategic infrastructure placement — assets positioned to enable timely sabotage
- Credibility — actors must believe rivals will follow through
Limitations
The Atlas catalogs structural problems:
Detection Threshold Ambiguity
Unlike nuclear weapons (clear physical signatures: fissile material production, weapon assembly, testing), ASI development lacks clear detection thresholds. When does an AI training run become an ASI attempt? Compute thresholds are imprecise; capability evaluations are imperfect.
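The imprecision of compute thresholds can be made concrete with the widely used rough estimate that dense-transformer training costs about 6 FLOPs per parameter per token. The 1e26 FLOP trigger below is an illustrative regulatory line, not a figure from the Atlas:

```python
# Minimal sketch of a compute-threshold check, assuming the common
# "~6 * parameters * tokens" estimate of training FLOPs.
# The 1e26 FLOP threshold is an illustrative regulatory trigger.

def training_flops(params, tokens):
    """Rough dense-transformer training cost: ~6 FLOPs per parameter per token."""
    return 6 * params * tokens

THRESHOLD = 1e26  # illustrative, not an endorsed number

def exceeds_threshold(params, tokens):
    return training_flops(params, tokens) >= THRESHOLD

# Two runs of broadly similar scale can fall on opposite sides of the line:
over = exceeds_threshold(params=2e12, tokens=1e13)     # ~1.2e26 FLOPs
under = exceeds_threshold(params=1.5e12, tokens=1e13)  # ~9.0e25 FLOPs
```

The two runs differ by less than a factor of two in cost yet only one trips the trigger, which is one reason a FLOP threshold is a noisy proxy for "ASI attempt."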
Distributed/Concealed Training
Several technologies could make ASI development harder to detect than nuclear weapons programs:
- Decentralized training over volunteer compute
- Concealed national programs
- Air-gapped facilities with off-grid power
Historical Precedent of Inaction
“Historical precedent suggests nations rarely escalate sufficiently for treaty enforcement.” MAD only worked because nuclear weapons are visibly catastrophic; ASI development can be framed ambiguously, and escalation may face domestic political opposition.
Distinction from Other ASI Strategies
| Strategy | Logic | Vulnerability |
|---|---|---|
| MAIM | Deter unilateral racing via threat of sabotage | Detection thresholds, escalation credibility |
| Global Moratorium (global-moratorium) | Halt AI development by agreement | Defection by single actors |
| Cooperation institutions (CERN, MAGIC) | Centralize and govern | Requires more cooperation than deterrence |
| Pivotal Acts (pivotal-act) | Use first aligned ASI to end risk period | Aligning the ASI in the first place |

Unlike a moratorium's passive agreement, MAIM is active deterrence; the two fail in different ways (defection for the moratorium, mis-detection and non-credible escalation for MAIM).
The Yudkowsky Variant
Yudkowsky’s position is sometimes framed as a hard MAIM: halt frontier AI research entirely, shut down large GPU clusters, and cap compute, with enforcement by military action if necessary to prevent catastrophic scenarios. This trades MAIM’s distributed deterrence for centralized enforcement.
Connection to Wiki
- asi-safety-strategies — MAIM is one of the four ASI strategies
- ai-governance — MAIM is the deterrence variant of ASI governance
- global-moratorium — alternative passive coordination
- pivotal-act — alternative end-the-risk-period strategy
- eliezer-yudkowsky — proponent of harder enforcement variants
- differential-development — MAIM raises the cost of unilateral capability racing
Related Pages
- asi-safety-strategies
- ai-governance
- global-moratorium
- pivotal-act
- eliezer-yudkowsky
- differential-development
- ai-safety-atlas-textbook
- atlas-ch3-strategies-05-asi-safety-strategies
Sources cited
Primary URLs harvested from this page’s summary references. Auto-generated by scripts/backfill_citations.py; edit by re-running, not by hand.
- AI Safety Atlas Ch.3 — ASI Safety Strategies — referenced as [[atlas-ch3-strategies-05-asi-safety-strategies]]
- AI Safety Atlas Ch.3 — Misuse Prevention Strategies — referenced as [[atlas-ch3-strategies-03-misuse-prevention-strategies]]