Deep ignorance: Filtering pretraining data builds tamper-resistant safeguards into open-weight LLMs

Kyle O’Brien, Stephen Casper, Quentin Anthony, Tomek Korbak, Robert Kirk, Xander Davies, … (+4 more) — 2025-08-08 — UK AI Security Institute, MIT, EleutherAI
