UK AISI’s Alignment Team: Research Agenda
Benjamin Hilton, Jacob Pfau, Marie_DB, Geoffrey Irving — 2025-05-07 — UK AI Safety Institute — LessWrong
Summary
UK AISI’s Alignment Team presents its research agenda, centred on developing safety case sketches to identify gaps in alignment approaches, with an initial emphasis on training honest AI systems via scalable oversight and asymptotic guarantees.
Source
- Link: https://lesswrong.com/posts/tbnw7LbNApvxNLAg8/uk-aisi-s-alignment-team-research-agenda
- Listed in the Shallow Review of Technical AI Safety 2025 under one agenda:
- asymptotic-guarantees — Theory