Evaluating the Goal-Directedness of Large Language Models

Tom Everitt, Cristina Garbacea, Alexis Bellot, Jonathan Richens, Henry Papadatos, Siméon Campos, … (+1 more) — 2025-04-16 — Google DeepMind, OpenAI, Anthropic — arXiv

Summary

Proposes a framework for measuring goal-directedness in LLMs, defined as the extent to which models use their capabilities in pursuit of a given goal, and evaluates frontier models from Google DeepMind, OpenAI, and Anthropic on tasks requiring information gathering, cognitive effort, and plan execution.

Key Result

Goal-directedness is relatively consistent across tasks, is distinct from task performance, and is only moderately sensitive to motivational prompts; most models evaluated are not fully goal-directed.
