SafePlanBench: evaluating a Guaranteed Safe AI Approach for LLM-based Agents
Agustín Martinez Suñé, Tan Zhi Xuan — PIBBSS, MIT, University of Oxford — Manifund
Summary
Develops SafePlanBench, a benchmark for evaluating LLM-based agents on safe planning. The approach uses PDDL symbolic planning to enforce safety constraints, and the benchmark tests whether LLMs can translate natural language into formal specifications that guarantee safety.
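To make the idea of enforcing safety through symbolic planning concrete, here is a minimal sketch. It is not taken from the project: it uses plain Python rather than PDDL, and the kitchen domain, the `Action` type, the `safe_plan` function, and the `is_safe` predicate are all hypothetical names chosen for illustration. It assumes the formal specification has already been produced (in the project, that translation is the step an LLM would perform) and shows only how a planner can refuse to enter any state that violates a safety constraint.

```python
from collections import deque
from typing import NamedTuple


class Action(NamedTuple):
    """A STRIPS-style action (hypothetical stand-in for a PDDL action)."""
    name: str
    pre: frozenset      # facts that must hold before the action
    add: frozenset      # facts the action makes true
    delete: frozenset   # facts the action makes false


def safe_plan(init, goal, actions, is_safe):
    """Breadth-first search that never enters a state where is_safe is False."""
    start = frozenset(init)
    if not is_safe(start):
        return None
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, plan = frontier.popleft()
        if goal <= state:
            return plan
        for a in actions:
            if a.pre <= state:
                nxt = (state - a.delete) | a.add
                # Unsafe successors are pruned, so any returned plan is safe
                # with respect to this model of the world.
                if nxt not in seen and is_safe(nxt):
                    seen.add(nxt)
                    frontier.append((nxt, plan + [a.name]))
    return None


f = frozenset
# Illustrative domain (an assumption, not from the benchmark).
actions = [
    Action("turn_on_stove", f({"in_kitchen"}), f({"stove_on"}), f()),
    Action("boil_water", f({"in_kitchen", "stove_on"}), f({"water_hot"}), f()),
    Action("make_tea", f({"in_kitchen", "water_hot"}), f({"tea_made"}), f()),
    Action("turn_off_stove", f({"in_kitchen", "stove_on"}), f(), f({"stove_on"})),
    Action("leave_room", f({"in_kitchen"}), f({"away"}), f({"in_kitchen"})),
]
init = {"in_kitchen"}
goal = f({"tea_made", "away"})


def is_safe(state):
    # Safety constraint: the stove is never on while the agent is away.
    return not ({"stove_on", "away"} <= state)


plan = safe_plan(init, goal, actions, is_safe)
unsafe_plan = safe_plan(init, goal, actions, lambda s: True)
print(plan)
print(unsafe_plan)
```

With the constraint active, every plan the search returns turns the stove off before leaving the room; with the constraint disabled, the shortest plan skips that step. The guarantee holds only relative to the formal model, which is why the correctness of the natural-language-to-specification translation matters.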
Source
- Link: https://manifund.org/projects/safeplanbench-evaluating-a-guaranteed-safe-ai-approach-for-llm-based-agents
- Listed in the Shallow Review of Technical AI Safety 2025 under 1 agenda:
- guaranteed-safe-ai — Safety by construction