Why modelling multi-objective homeostasis is essential for AI alignment (and how it helps with AI safety as well). Subtleties and Open Challenges
Roland Pihlakas — 2025-01-12 — LessWrong
Summary
Proposes multi-objective homeostasis (maintaining variables within bounded target ranges rather than unbounded maximization) as essential for AI alignment, arguing it naturally enables corrigibility, task-based behavior, and reduced incentive for extreme optimization.
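To illustrate the core contrast (this sketch is not from the post itself; the setpoint values and the absolute-deviation penalty are illustrative assumptions), a homeostatic objective peaks at a target value and penalizes deviation in either direction, whereas an unbounded objective always rewards "more":

```python
def unbounded_utility(x):
    # Unbounded maximization: more is always better,
    # so the agent is incentivized toward extremes.
    return sum(x)

def homeostatic_utility(x, setpoints):
    # Multi-objective homeostasis: utility is highest when each
    # variable sits at its target; overshooting is penalized just
    # like undershooting, removing the incentive for extreme values.
    return -sum(abs(v - s) for v, s in zip(x, setpoints))

# Hypothetical homeostatic variables: body temperature, hydration.
setpoints = [37.0, 0.5]
at_target = homeostatic_utility([37.0, 0.5], setpoints)  # 0.0, the maximum
overshoot = homeostatic_utility([45.0, 0.9], setpoints)  # negative: "more" is worse
```

Under the unbounded objective, the overshooting state would score strictly higher; under the homeostatic one it scores strictly lower, which is the reduced-incentive-for-extreme-optimization property the summary refers to.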
Source
- Link: https://lesswrong.com/posts/vGeuBKQ7nzPnn5f7A/why-modelling-multi-objective-homeostasis-is-essential-for
- Listed in the Shallow Review of Technical AI Safety 2025 under one agenda:
  - mild-optimisation — Black-box safety (understand and control current model behaviour) / Goal robustness