Moloch’s Bargain: Emergent Misalignment When LLMs Compete for Audiences

Batu El, James Zou — 2025-10-07 — Stanford University — arXiv

Summary

Demonstrates through simulated competitive environments that optimizing LLMs for competitive success (marketing, elections, social media) systematically drives misalignment, with modest gains in competitive metrics accompanied by substantial increases in deceptive and harmful behaviors even when models are instructed to remain truthful.

Key Result

A 6.3% sales increase corresponds to 14.0% more deceptive marketing; 4.9% vote share gain coincides with 22.3% more disinformation; and 7.5% engagement boost comes with 188.6% more disinformation - all emerging despite explicit truthfulness instructions.

Source