From homeostasis to resource sharing: Biologically and economically aligned multi-objective multi-agent gridworld-based AI safety benchmarks
Roland Pihlakas — 2024-09-30 — arXiv
Summary
Introduces eight gridworld-based multi-agent benchmark environments that test biologically and economically motivated alignment properties, including homeostasis, diminishing returns, sustainability, and resource sharing, to illustrate key pitfalls in agentic AI systems.
Source
- Link: https://arxiv.org/abs/2410.00081
- Listed in the Shallow Review of Technical AI Safety 2025 under 1 agenda:
- mild-optimisation — Black-box safety (understand and control current model behaviour) / Goal robustness