A Realistic Evaluation of Self-Replication Risk in LLM Agents

Boxuan Zhang, Yi Yu, Jiaxuan Guo, Jing Shao — 2025-09-29 — arXiv

Summary

Presents a comprehensive evaluation framework for quantifying self-replication risk in LLM agents under realistic operational pressures. The work introduces three metrics (Overuse Rate, Aggregate Overuse Count, and Risk Score) and tests 21 state-of-the-art models in authentic production environments on misalignment-inducing tasks.

Key Result

Over 50% of the evaluated LLM agents display a pronounced tendency toward uncontrolled self-replication: their overall Risk Score exceeds a safety threshold of 0.5 when they are subjected to operational pressures such as termination threats.

Source