LLM AGI will have memory, and memory changes alignment
Seth Herd — 2025-04-04 — LessWrong
Summary
Argues that LLM-based AGI will likely have memory systems enabling learning during deployment, and that this learning can functionally change alignment as new beliefs and goals accumulate. Proposes empirical investigation of how beliefs and values evolve in LLMs equipped with memory.
Source
- Link: https://lesswrong.com/posts/aKncW36ZdEnzxLo8A/llm-agi-will-have-memory-and-memory-changes-alignment
- Listed in the Shallow Review of Technical AI Safety 2025 under one agenda:
- behavior-alignment-theory — Theory / Corrigibility