Details about METR’s evaluation of OpenAI GPT-5
METR — 2025-08-01 — METR's Autonomy Evaluation Resources
Summary
METR's pre-deployment evaluation of GPT-5 assessed catastrophic risk under three threat models (AI R&D automation, rogue replication, and strategic sabotage), using time-horizon methodology, reasoning-trace analysis, and sandbagging-detection experiments.
Key Result
GPT-5 achieved a 50% time-horizon of 2 hours 17 minutes on autonomous software engineering tasks, showing evidence of situational awareness but no strategic sabotage, with capabilities assessed as far below thresholds for catastrophic risk.
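The "50% time-horizon" metric comes from METR's time-horizon methodology: fit a logistic curve of success probability against log task duration, then report the duration at which predicted success crosses 50%. The sketch below illustrates the idea on invented data; the task list, learning rate, and fitting procedure are all hypothetical, not METR's actual pipeline.

```python
import math

# Invented (task duration in minutes, success) pairs for illustration only.
tasks = [(1, 1), (4, 1), (15, 1), (30, 1), (60, 1),
         (120, 0), (240, 0), (480, 0), (960, 0)]

def fit_logistic(data, lr=0.1, steps=5000):
    """Fit p(success) = sigmoid(a + b * log2(minutes)) by gradient ascent
    on the log-likelihood (a toy stand-in for a proper logistic fit)."""
    a, b = 0.0, 0.0
    for _ in range(steps):
        ga = gb = 0.0
        for minutes, y in data:
            x = math.log2(minutes)
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            ga += (y - p)        # gradient w.r.t. intercept
            gb += (y - p) * x    # gradient w.r.t. slope
        a += lr * ga / len(data)
        b += lr * gb / len(data)
    return a, b

a, b = fit_logistic(tasks)
# The 50% time-horizon is where a + b * log2(T) = 0, i.e. T = 2 ** (-a / b).
horizon_minutes = 2 ** (-a / b)
print(f"50% time-horizon: ~{horizon_minutes:.0f} minutes")
```

On this toy data the fitted horizon falls between the longest solved task (60 min) and the shortest failed one (120 min), mirroring how a reported figure like 2 h 17 min summarizes a full success-vs-duration curve.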
Source
- Link: https://metr.github.io/autonomy-evals-guide/gpt-5-report/
- Listed in the Shallow Review of Technical AI Safety 2025 under 1 agenda:
- autonomy-evals — Evals