MoSSAIC: AI Safety After Mechanism

Matt Farr, Aditya Arpitha Prasad, Chris Pang, Aditya Adiga, Jayson Amati, Sahil K — 2025-07-01 — ODYSSEY 2025 Conference

Summary

Critiques the causal-mechanistic paradigm in AI safety, particularly mechanistic interpretability, and argues that it will fail as intelligence scales. Proposes MoSSAIC (Management of Substrate-Sensitive AI Capabilities) as a supplementary framework that addresses the paradigm's limitations in dealing with evasive intelligence.

Source