Why Future AIs will Require New Alignment Methods
Alvin Ånestrand — 2025-10-10 — LessWrong
Summary
Introduces the concept of ‘alignment depth’, tied to task completion time. Argues that current alignment methods (HHH, deliberative alignment), which work for short tasks, will be insufficient for AGI capable of completing longer tasks, because longer tasks require different kinds of behavioral consistency.
Source
- Link: https://lesswrong.com/posts/TxiB6hvnQqxXB5XDJ/why-future-ais-will-require-new-alignment-methods
- Listed in the Shallow Review of Technical AI Safety 2025 under 1 agenda:
- capability-evals — Evals