Evaluating Frontier Models for Stealth and Situational Awareness

Mary Phuong, Roland S. Zimmermann, Ziyue Wang, David Lindner, Victoria Krakovna, Sarah Cogan, … (+3 more) — 2025-05-02 — Google DeepMind — arXiv

Summary

Presents a suite of 16 evaluations measuring prerequisites for AI scheming: 5 evaluations of stealth (the ability to circumvent oversight) and 11 evaluations of situational awareness (instrumental reasoning about itself and its environment). Also demonstrates how these evaluations can inform scheming inability safety cases.

Key Result

Current frontier models show no concerning levels of either situational awareness or stealth capabilities required for successful scheming.

Source

Link: https://arxiv.org/abs/2505.01420
Listed in the Shallow Review of Technical AI Safety 2025 under 2 agendas:
- google-deepmind — Labs (giant companies)
- situational-awareness-and-self-awareness-evals — Evals
