Imitation learning is probably existentially safe
Michael K. Cohen, Marcus Hutter — 2025-11-21 — University of California, Berkeley, Australian National University — AI Magazine
Summary
Argues that advanced imitation learners are unlikely to cause human extinction by presenting rebuttals to six technical arguments claiming imitation learning poses existential risk, including arguments about goal-directed subagents, deceptive alignment, and means-end reasoning.
Source
- Link: https://onlinelibrary.wiley.com/doi/10.1002/aaai.70040?af=R
- Listed in the Shallow Review of Technical AI Safety 2025 under one agenda:
- behavior-alignment-theory — Theory / Corrigibility