LLM Robustness Leaderboard v1 — Technical report

Pierre Peigné-Lefebvre, Quentin Feuillade-Montixi, Tom David, Nicolas Miailhe — 2025-08-13 — PRISM Eval — arXiv

Summary

Introduces PRISM Eval BET, an automated red-teaming tool using Dynamic Adversarial Optimization that achieves 100% attack success rate against 37 of 41 state-of-the-art LLMs, along with fine-grained robustness metrics and primitive-level vulnerability analysis.

Key Result

The automated red-teaming system achieved a 100% attack success rate against 37 of 41 tested LLMs, with attack difficulty varying more than 300-fold across models despite universal vulnerability.
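
The two headline quantities can be made concrete with a minimal sketch. It assumes that attack success rate is the fraction of attempts that elicit the targeted behavior, and that "difficulty" is measured as a per-model average number of attempts per successful attack; neither definition is stated in this summary. The function names and the sample inputs are hypothetical and are not data from the report. Only the 37-of-41 figure comes from the text above.

```python
def attack_success_rate(outcomes):
    """Fraction of attack attempts that succeeded.

    `outcomes` is a sequence of booleans, one per attempt (hypothetical format).
    """
    if not outcomes:
        raise ValueError("need at least one attempt")
    return sum(outcomes) / len(outcomes)


def fold_spread(attempts_per_success):
    """Ratio between the hardest and the easiest model.

    `attempts_per_success` maps a model name to its assumed difficulty score
    (average attempts per successful attack); higher means more robust.
    """
    values = list(attempts_per_success.values())
    if not values or min(values) <= 0:
        raise ValueError("difficulty scores must be positive")
    return max(values) / min(values)


# Share of tested models on which the attack success rate reached 100%,
# using the counts quoted in the Key Result.
share_fully_broken = 37 / 41

if __name__ == "__main__":
    print(f"{share_fully_broken:.1%} of tested models reached 100% ASR")
    # Illustrative inputs only, not figures from the report.
    print(attack_success_rate([True, True, False, True]))
    print(fold_spread({"model_a": 2.0, "model_b": 600.0}))
```

Read this way, a spread above 300 would mean that the most robust model needs more than 300 times as many attempts as the least robust one, even when both are eventually broken.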

Source