AI Safety Atlas Ch.3 — Introduction

Source: Strategies — Introduction | Authors: Markov Grey & Charbel-Raphaël Ségerie | Updated Summer 2025 | 3 min

The chapter lays out the big picture of AI safety strategies for mitigating the risks explored in Ch.2, organizing them into three primary categories united by a defense-in-depth philosophy.

Three Strategy Families

  • Misuse prevention: access controls and technical safeguards that limit harmful applications
  • AGI/ASI safety: alignment and control measures for advanced systems
  • Socio-technical interventions: governance, security, and culture; these apply across all categories

The thesis: “a comprehensive approach that combines many of these strategies” works better than any strategy implemented in isolation; this is the defense-in-depth framework. See atlas-ch3-strategies-07-combining-strategies for the integrated four-step sequence.

Scope Limitations

The chapter explicitly excludes the following topics from its scope:

  • AI-generated misinformation and deepfakes (covered partially in epistemic-erosion)
  • Data privacy concerns
  • Standard cybersecurity practices
  • Bias and toxicity issues
  • AI welfare considerations
  • Capability gaps unrelated to misalignment

This focused scope reflects the safety community’s emphasis on existential and large-scale catastrophic risks from advanced, potentially agentic AI systems.

Connection to Wiki

Ch.3 maps onto and substantially deepens existing wiki strategy pages: