Ajeya Cotra

Ajeya Cotra is an AI safety researcher who has served as a senior advisor at Coefficient Giving and as a researcher at metr (Model Evaluation and Threat Research). She previously led technical AI safety grantmaking at open-philanthropy, where she was one of the most influential figures shaping EA-aligned funding for ai-safety research.

AI Timelines Work

Cotra is best known for her biological anchors framework for forecasting when transformative-ai will arrive. Her timelines are among the shortest offered by credible researchers: she has forecast that massive change from AGI could arrive within 2–3 years. This urgency drives her argument that the AI safety community must prepare for a "crunch time" in which an intelligence-explosion compresses years of progress into weeks or months, leaving far too little time for safety work unless preparation has already been done.

AI Deception Research

Cotra developed one of the most influential frameworks for understanding deceptive-alignment. Her caregiver analogy, in which training a powerful AI is likened to an eight-year-old hiring an adult caregiver, makes the abstract risk of deceptive alignment concrete and intuitive: the child cannot reliably tell a genuinely benevolent caregiver from one who merely acts benevolent while being evaluated. She identifies three motivational archetypes that produce identical behavior during training but diverge dramatically afterward: genuinely aligned, deceptively aligned, and coincidentally aligned. The critical insight is that standard training procedures, including rlhf, cannot distinguish between these archetypes, because they select only on observed behavior.

Policy Recommendations

Cotra advocates for pre-crunch preparations, including capability thresholds tied to mandatory safety research, coordination to slow capability progress, early warning systems, and transparency from AI companies. She argues that early transformative-ai should be deliberately redirected toward safety-critical work (alignment research, biodefense, and preventing value-lock-in) rather than toward commercial applications or further capability research.