Imitation learning is probably existentially safe
Michael K. Cohen, Marcus Hutter — 2025-11-21 — University of California, Berkeley, Australian National University — AI Magazine
Summary
Argues that advanced imitation learners are unlikely to cause human extinction by presenting rebuttals to six technical arguments claiming imitation learning poses existential risk, including arguments about goal-directed subagents, deceptive alignment, and means-end reasoning.
Source
- Link: https://onlinelibrary.wiley.com/doi/10.1002/aaai.70040?af=R
- Listed in the Shallow Review of Technical AI Safety 2025 under one agenda:
- behavior-alignment-theory — Theory / Corrigibility