Serious Flaws in CAST
Max Harms — 2025-11-19 — MIRI
Source
- Link: https://www.lesswrong.com/s/KfCjeconYRdFbMxsy/p/qgBFJ72tahLo5hzqy
- Listed in the Shallow Review of Technical AI Safety 2025 under 1 agenda(s):
- behavior-alignment-theory — Theory / Corrigibility