AIs at the current capability level may be important for future safety work
Ryan Greenblatt — 2025-05-12 — Anthropic — LessWrong
Summary
Argues that AI systems at the current capability level may remain important for future safety work for two reasons: trusted models might not exceed current capabilities, and compute constraints may drive automated safety research toward smaller models.
Source
- Link: https://lesswrong.com/posts/cJQZAueoPC6aTncKK/ais-at-the-current-capability-level-may-be-important-for
- Listed in the Shallow Review of Technical AI Safety 2025 under 1 agenda:
- control — Black-box safety (understand and control current model behaviour) / Iterative alignment