MoSSAIC: AI Safety After Mechanism

Matt Farr, Aditya Arpitha Prasad, Chris Pang, Aditya Adiga, Jayson Amati, Sahil K — 2025-07-01 — ODYSSEY 2025 Conference

Summary

The paper critiques the causal-mechanistic paradigm in AI safety, particularly mechanistic interpretability, and argues that this paradigm will fail as intelligence scales. It proposes MoSSAIC (Management of Substrate-Sensitive AI Capabilities) as a supplementary framework that addresses the paradigm's limitations when dealing with evasive intelligence.

Source

Link: https://openreview.net/forum?id=n7WYSJ35FU
Listed in the Shallow Review of Technical AI Safety 2025 under 1 agenda:
- high-actuation-spaces — Theory
