How to evaluate control measures for LLM agents? A trajectory from today to superintelligence

Tomek Korbak, Mikita Balesni, Buck Shlegeris, Geoffrey Irving — 2025-04-07 — Redwood Research, Google DeepMind — arXiv

Summary

Proposes a systematic framework for adapting control evaluation procedures to advancing AI capabilities. Defines five AI Control Levels (ACLs), each with corresponding evaluation rules, control measures, and safety cases, covering capability profiles from current systems to superintelligence.

Source