Beliefs about formal methods and AI safety
Quinn Dougherty — 2025-10-23 — LessWrong
Summary
Argues against trying to formally verify neural networks themselves, advocating instead for applying formal methods to AI safety through three approaches: infrastructure hardening, defense in depth (the Swiss cheese model), and formal verification at AI-system interfaces, where the AI must prove that its proposed actions satisfy a specification.
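The third approach can be sketched as a proof-carrying interface: the AI submits an action together with a certificate, and a small trusted checker verifies the certificate against the specification before the action runs. This is a minimal hypothetical illustration (the spec, action, and certificate types are invented for the example, not from the post); only the checker needs to be in the trusted base.

```python
from dataclasses import dataclass

@dataclass
class Action:
    amount: int  # e.g. a resource request (hypothetical)

@dataclass
class Certificate:
    claimed_bound: int  # the bound the proposer claims the action respects

SPEC_LIMIT = 100  # interface specification: amount must stay <= 100

def check(action: Action, cert: Certificate) -> bool:
    """Trusted checker: confirm the certificate entails the spec
    and the action actually satisfies the certified bound."""
    return cert.claimed_bound <= SPEC_LIMIT and action.amount <= cert.claimed_bound

def gate(action: Action, cert: Certificate) -> str:
    """Only actions with a valid certificate are executed."""
    return "executed" if check(action, cert) else "rejected"

print(gate(Action(42), Certificate(50)))    # within certified bound -> executed
print(gate(Action(150), Certificate(200)))  # certificate exceeds spec -> rejected
```

In a real system the certificate would be a machine-checkable proof object rather than a claimed integer bound, but the trust structure is the same: the untrusted proposer does the hard work, and a simple verifier gates execution.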
Source
- Link: https://lesswrong.com/posts/CCT7Qc8rSeRs7r5GL/beliefs-about-formal-methods-and-ai-safety
- Listed in the Shallow Review of Technical AI Safety 2025 under one agenda:
- guaranteed-safe-ai — Safety by construction