Recommendations for Technical AI Safety Research Directions

Anthropic Alignment Science Team, Alignment Science Blog, 2025

Summary

Anthropic’s Alignment Science team presents a broad research agenda of open problems in AI safety, including evaluating model capabilities and alignment, AI control, scalable oversight, adversarial robustness, and multi-agent alignment.
