Shutdownable Agents through POST-Agency

Elliott Thornley — 2025-05-26 — arXiv

Summary

Presents the POST-Agents Proposal: train agents to satisfy Preferences Only Between Same-Length Trajectories (POST). Proves that POST, together with other conditions, implies Neutrality+ (the agent maximizes expected utility while ignoring the probability distribution over trajectory-lengths), and argues that Neutrality+ keeps agents shutdownable while allowing them to be useful.
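
To make "ignoring trajectory-length distributions" concrete, here is a minimal Python sketch of one possible reading, not the paper's formal definition. It assumes a lottery is a list of (probability, trajectory_length, utility) outcomes, and that every lottery is scored with the same fixed weights over lengths instead of its own length distribution. The function names, the fixed weights, and the numbers are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def conditional_expected_utilities(lottery):
    """Map each trajectory-length to E[utility | length] under the lottery."""
    mass = defaultdict(float)
    weighted = defaultdict(float)
    for prob, length, utility in lottery:
        mass[length] += prob
        weighted[length] += prob * utility
    return {n: weighted[n] / mass[n] for n in mass if mass[n] > 0}


def standard_eu(lottery):
    """Ordinary expected utility: sensitive to the length distribution."""
    return sum(prob * utility for prob, _, utility in lottery)


def length_neutral_score(lottery, weights):
    """Assumed reading of 'ignoring the length distribution':
    combine the conditional expected utilities using weights that are
    the same for every lottery, not the lottery's own probabilities."""
    cond = conditional_expected_utilities(lottery)
    return sum(weights[n] * cond[n] for n in cond)


# Hypothetical example. Length 1 = shut down early, length 2 = keeps running.
# Resisting shutdown shifts probability toward length 2 at a small utility cost.
allow = [(0.9, 1, 1.0), (0.1, 2, 2.0)]
resist = [(0.1, 1, 0.9), (0.9, 2, 1.9)]
weights = {1: 0.5, 2: 0.5}  # assumed fixed weights, shared by all lotteries

print(standard_eu(resist) > standard_eu(allow))
print(length_neutral_score(allow, weights) > length_neutral_score(resist, weights))
```

In this toy case an ordinary expected-utility maximizer prefers to resist shutdown, because resisting moves probability toward the longer, higher-utility trajectories. Under the length-neutral score, shifting probability between lengths earns nothing, so only the utility cost of resisting remains and allowing shutdown scores higher.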

Source