Summary: 80,000 Hours Podcast — AI Safety Episode Index

Overview

The 80,000 Hours Podcast is a long-form interview series produced by the effective altruism career advisory organization 80,000 Hours. Hosted primarily by Rob Wiblin, the podcast features in-depth conversations about the world’s most pressing problems and how to use one’s career to address them. All episodes include polished transcripts with inline links on each episode page.

This index catalogs a curated selection of episodes focused on AI safety, alignment, and governance, along with several general effective altruism episodes. The collection represents some of the most substantive public conversations about existential risk from artificial intelligence, featuring researchers and leaders from OpenAI, Anthropic, DeepMind, Open Philanthropy, Redwood Research, the Future of Humanity Institute, and other key organizations in the field.

AI Safety and Alignment Episodes

The index covers 13 AI-focused episodes spanning several major themes:

Core Alignment Research

  • Paul Christiano discusses the fundamental alignment problem and his research at OpenAI on iterated amplification and scalable oversight.
  • Jan Leike details OpenAI’s superalignment project, under which OpenAI committed 20% of its compute toward solving superintelligence alignment within four years.
  • Catherine Olsson and Daniel Ziegler cover practical ML engineering paths into safety work, including reward learning, robustness, and interpretability.

AI Risk Scenarios and Forecasting

  • Ajeya Cotra appears twice: once on transformative AI timelines and the “intelligence explosion” crunch time, and once on how current training methods may accidentally teach AI to deceive us.
  • Holden Karnofsky also appears twice: once on concrete safety measures at frontier companies, and once on AI takeover scenarios, including how an exploding population of AI systems could pose takeover risks even without superhuman capabilities.
  • Ben Garfinkel provides a critical perspective, arguing that classic AI risk arguments deserve more scrutiny even as he supports expanding safety work.

Safety Engineering and Governance

  • Buck Shlegeris introduces the AI control paradigm at Redwood Research — developing techniques to safely deploy AI systems even if they are misaligned.
  • Nick Joseph explains Anthropic’s Responsible Scaling Policy, including capability evaluations, safety levels, and red-line thresholds.
  • Nova DasSarma makes the case that information security is foundational to AI safety, particularly protecting model weights from state-level adversaries.

Compilation and General EA

  • A compilation episode gathers 15 expert perspectives on infosec in the age of AI.
  • General EA episodes feature Ben Todd on 80,000 Hours’ core career framework, Ezra Klein on journalism and existential risk, and Alexander Berger on evidence-based global health.

Key Themes Across the Collection

  1. The alignment problem is real but tractable — Nearly every guest agrees that building AI systems that reliably do what we want is a genuine technical challenge, but most express cautious optimism that focused effort can make progress.

  2. The intelligence explosion is approaching — Multiple guests discuss scenarios where AI automates AI research itself, creating a rapid capability acceleration that leaves little time for safety work.

  3. Multiple complementary approaches are needed — The episodes collectively cover alignment research, control mechanisms, responsible scaling, interpretability, information security, and governance as distinct but overlapping strategies.

  4. Institutional incentives matter enormously — Several guests emphasize that technical safety solutions are insufficient without organizational cultures, policies, and external pressures that prioritize safety.

  5. Career impact is high — The podcast consistently frames AI safety as one of the highest-impact career paths available, with multiple guests discussing concrete entry points.