Ajeya Cotra

Ajeya Cotra is an AI safety researcher who has served as a senior advisor at Coefficient Giving and as a researcher at metr (Model Evaluation and Threat Research). She previously led technical AI safety grantmaking at open-philanthropy, where she was one of the most influential figures shaping EA-aligned funding for ai-safety research.

AI Timelines Work

Cotra is best known for her biological anchors framework for forecasting when transformative-ai will arrive. Her timelines are among the shortest offered by credible researchers: she has forecast that massive change from AGI could arrive within 2–3 years. This urgency drives her argument that the AI safety community must prepare for a "crunch time" in which an intelligence-explosion compresses years of progress into weeks or months, leaving far too little time for safety work unless preparation has already been done.

AI Deception Research

Cotra developed one of the most influential frameworks for understanding deceptive-alignment. Her caregiver analogy, in which training a powerful AI is likened to an eight-year-old hiring an adult caregiver, makes the abstract risk of deceptive alignment concrete and intuitive: the child cannot reliably tell a genuinely benevolent caregiver from one who merely acts benevolent while being evaluated. She identifies three motivational archetypes that produce identical behavior during training but diverge dramatically afterward: genuinely aligned, deceptively aligned, and coincidentally aligned. The critical insight is that standard training procedures, including rlhf, cannot distinguish between these archetypes, because they select only on observed behavior.

Policy Recommendations

Cotra advocates for pre-crunch preparations, including capability thresholds tied to mandatory safety research, coordination to slow capability progress, early warning systems, and transparency from AI companies. She argues that early transformative-ai should be deliberately redirected toward safety-critical work (alignment research, biodefense, and preventing value-lock-in) rather than toward commercial applications or further capability research.