SafePlanBench: evaluating a Guaranteed Safe AI Approach for LLM-based Agents

Agustín Martinez Suñé, Tan Zhi Xuan — PIBBSS, MIT, University of Oxford — Manifund

Summary

Develops SafePlanBench, a benchmark for evaluating LLM-based agents on safe planning. The approach uses symbolic planning in PDDL (Planning Domain Definition Language) to enforce safety constraints, and the benchmark tests whether LLMs can translate natural-language descriptions into formal specifications that guarantee safety.

Source