Against RL: The Case for System 2 Learning

Andreas Stuhlmüller — 2025-01-30 — Elicit — Elicit Blog

Summary

Argues that reinforcement learning is fundamentally unsafe for superintelligent systems because it relies on ‘System 1 learning’ (fast, intuitive updates). Proposes developing ‘System 2 learning’ methods that deliberately reason about how beliefs should be updated from data, though the technical details remain unspecified.

Source