18 Applications of Deception Probes
Cleo Nardo, Avi Parrack, jordine — 2025-08-28
Source
- Link: https://www.lesswrong.com/posts/7zhAwcBri7yupStKy/18-applications-of-deception-probes
- Listed in the Shallow Review of Technical AI Safety 2025 under 1 agenda:
- lie-and-deception-detectors — White-box safety (i.e. Interpretability)