AI Safety Compendium

Preference gaps as a safeguard against AI self-replication

27 Apr 2026 · 1 min read

tbs, EJT — 2025-11-26

Source

Link: https://www.lesswrong.com/posts/knwR9RgGN5a2oorci/preference-gaps-as-a-safeguard-against-ai-self-replication
Listed in the Shallow Review of Technical AI Safety 2025 under 1 agenda:
- behavior-alignment-theory — Theory / Corrigibility

Related Pages

behavior-alignment-theory
