Model-Based Soft Maximization of Suitable Metrics of Long-Term Human Power
Jobst Heitzig, Ram Potham — 2025-07-31 — arXiv
Summary
Proposes a parametrizable objective function for AI agents that represents inequality- and risk-averse long-term aggregate human power, with algorithms for computing it via backward induction or multi-agent reinforcement learning from world models.
Source
- Link: https://arxiv.org/abs/2508.00159
- Listed in the Shallow Review of Technical AI Safety 2025 under one agenda:
- behavior-alignment-theory — Theory / Corrigibility
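The paper's core computational idea, soft (rather than hard) maximization of a long-term power metric via backward induction on a world model, can be sketched in a toy form. Everything below is illustrative: the tiny MDP, the `power` values, and the `beta` temperature are invented stand-ins, not the authors' actual metric or algorithm. The sketch only shows the generic "replace `max` with a log-sum-exp soft maximum" step applied over a finite horizon.

```python
import math

# Toy finite-horizon MDP: three states, two actions.
# transition[s][a] gives the next state; power[s] is an invented
# stand-in for the paper's aggregate-human-power metric.
transition = {0: [1, 2], 1: [1, 2], 2: [2, 1]}
power = {0: 0.0, 1: 1.0, 2: 0.5}

def soft_backward_induction(horizon, beta=2.0):
    """Backward induction with soft maximization: instead of the hard
    max_a Q(s, a), use (1/beta) * log sum_a exp(beta * Q(s, a)),
    which softens the extreme-optimizer behavior a hard argmax induces."""
    V = {s: power[s] for s in transition}  # terminal values
    for _ in range(horizon):
        V_new = {}
        for s in transition:
            qs = [power[s] + V[transition[s][a]] for a in range(2)]
            V_new[s] = (1.0 / beta) * math.log(
                sum(math.exp(beta * q) for q in qs)
            )
        V = V_new
    return V

values = soft_backward_induction(horizon=3)
```

As `beta` grows, the log-sum-exp approaches the hard maximum; small `beta` spreads value across actions, which is the "soft" part of the proposal.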