Beyond Preferences in AI Alignment
Tan Zhi-Xuan, Micah Carroll, Matija Franklin, Hal Ashton — 2024-08-30 — arXiv
Summary
Critiques the dominant preference-based approach to AI alignment, arguing that preferences inadequately represent human values and that expected utility theory is insufficient as a normative framework. Proposes reframing alignment targets toward normative standards appropriate to AI systems’ social roles, negotiated among stakeholders.
Source
- Link: https://arxiv.org/abs/2408.16984
- Listed in the Shallow Review of Technical AI Safety 2025 under two agendas:
  - aligning-to-context — Multi-agent first
  - aligning-to-the-social-contract — Multi-agent first
- Editorial blurb (verbatim):
  - [2408.16984 - Beyond Preferences in AI Alignment](https://arxiv.org/abs/2408.16984)