Communication & Trust
Abram Demski — 2025-07-09 — ODYSSEY 2025 Conference
Summary
Proves a new theorem establishing self-trust for Updateless Decision Theory (UDT) agents under conditions relating to self-communication. The proof uses formalisms from agent boundaries, Cartesian Frames, and Finite Factored Sets, and addresses reflective consistency under weaker assumptions than prior work.
Key Result
Agent-moments can achieve reflective consistency through communication as a ‘release valve’ for self-modification pressures, establishing self-trust for UDT under less restrictive assumptions than previous proofs.
Source
- Link: https://openreview.net/forum?id=Rf1CeGPA22
- Listed in the Shallow Review of Technical AI Safety 2025 under one agenda:
- tiling-agents — Theory