A Concrete Roadmap towards Safety Cases based on Chain-of-Thought Monitoring

Wuschel Schulz — 2025-10-23 — arXiv

Summary

A technical research roadmap proposing how to integrate chain-of-thought (CoT) monitoring into AI safety cases. It identifies prerequisites for CoT monitorability, threats such as neuralese and encoded reasoning, and six concrete methods for maintaining monitorability, including a novel Trusted KV Caching approach.

Source