AssistanceZero: Scalably Solving Assistance Games

Cassidy Laidlaw, Eli Bronstein, Timothy Guo, Dylan Feng, Lukas Berglund, Justin Svegliato, … (+2 more) — 2025-04-09 — UC Berkeley — arXiv

Summary

Presents AssistanceZero, the first scalable algorithm for solving assistance games by extending AlphaZero with neural networks that predict human actions and rewards, enabling planning under uncertainty about shared goals.

Key Result

In a Minecraft-based assistance game with over 10^400 possible goals, AssistanceZero outperforms model-free RL and imitation learning; in human studies, it significantly reduces the number of actions participants take.

Source

Link: https://arxiv.org/abs/2504.07091
Listed in the Shallow Review of Technical AI Safety 2025 under one agenda:
- assistance-games-assistive-agents — Black-box safety (understand and control current model behaviour) / Goal robustness
