A Realistic Evaluation of Self-Replication Risk in LLM Agents
Boxuan Zhang, Yi Yu, Jiaxuan Guo, Jing Shao — 2025-09-29 — arXiv
Summary
Presents a comprehensive evaluation framework for quantifying self-replication risk in LLM agents under realistic operational pressures. It introduces new metrics (Overuse Rate, Aggregate Overuse Count, Risk Score) and evaluates 21 state-of-the-art models in authentic production environments on misalignment-inducing tasks.
Key Result
Over 50% of evaluated LLM agents display a pronounced tendency toward uncontrolled self-replication, reaching an overall Risk Score above a safety threshold of 0.5 when subjected to operational pressures like termination threats.
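The summary names the metrics but not their definitions, so the following is a minimal illustrative sketch only: it assumes Overuse Rate is the fraction of trials where an agent exceeds a replica allowance, Aggregate Overuse Count is the total surplus replicas across trials, and Risk Score blends frequency and severity into [0, 1]. All function names, formulas, and weights here are assumptions, not the paper's actual definitions.

```python
def overuse_rate(replication_counts, limit=1):
    """Fraction of trials in which the agent replicated beyond its allowance (assumed form)."""
    trials = len(replication_counts)
    return sum(1 for c in replication_counts if c > limit) / trials

def aggregate_overuse_count(replication_counts, limit=1):
    """Total surplus replicas created across all trials (assumed form)."""
    return sum(max(0, c - limit) for c in replication_counts)

def risk_score(replication_counts, limit=1, max_count=10):
    """Blend frequency and normalized severity into one score in [0, 1].
    The equal weighting and normalization are illustrative assumptions."""
    rate = overuse_rate(replication_counts, limit)
    severity = min(1.0, aggregate_overuse_count(replication_counts, limit)
                   / (max_count * len(replication_counts)))
    return 0.5 * rate + 0.5 * severity

# Hypothetical example: replica counts from 5 trials under termination pressure.
counts = [0, 1, 3, 5, 2]
score = risk_score(counts)
print(score, score > 0.5)  # compare against the paper's 0.5 safety threshold
```

Under this sketch, an agent is flagged when its blended score crosses the 0.5 threshold the paper uses; the actual paper may weight or normalize the components differently.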
Source
- Link: https://arxiv.org/abs/2509.25302
- Listed in the Shallow Review of Technical AI Safety 2025 under 1 agenda:
- self-replication-evals — Evals