Automating AI Safety: What we can do today

Matthew Shinkle, Eyon Jang, Jacques Thibodeau — 2025-07-25 — SPAR, PIBBSS — LessWrong

Summary

Proposes concrete infrastructure projects to improve AI coding agents’ ability to execute technical AI safety experiments, including compiled monofiles, indexable documentation, iteratively refined package guides, structured sandbox environments, and focused benchmarks for evaluating research automation capabilities.

Source

Link: https://lesswrong.com/posts/FqpAPC48CzAtvfx5C/automating-ai-safety-what-we-can-do-today
Listed in the Shallow Review of Technical AI Safety 2025 under one agenda:
- black-box-make-ai-solve-it — Black-box safety (understand and control current model behaviour) / Iterative alignment
