UK AISI’s Alignment Team: Research Agenda

Benjamin Hilton, Jacob Pfau, Marie_DB, Geoffrey Irving — 2025-05-07 — UK AI Safety Institute — LessWrong

Summary

UK AISI’s Alignment Team presents its research agenda, which focuses on developing safety case sketches to identify gaps in alignment approaches, with an initial emphasis on training honest AI systems via scalable oversight and asymptotic guarantees.

Source