DECEPTIONBENCH: A Comprehensive Benchmark for AI Deception Behaviors in Real-world Scenarios
Yao Huang, Yitong Sun, Yichi Zhang, Ruochen Zhang, Yinpeng Dong, Xingxing Wei — 2025-10-17
Source
- Link: https://arxiv.org/pdf/2510.15501
- Listed in the Shallow Review of Technical AI Safety 2025 under 1 agenda:
- ai-deception-evals — Evals