Evaluating potential cybersecurity threats of advanced AI

Four Flynn, Mikel Rodriguez, Raluca Ada Popa — 2025-04-02 — Google DeepMind — Google DeepMind Blog

Summary

Presents a comprehensive framework and a 50-challenge benchmark for evaluating the offensive cyber capabilities of AI models across the entire cyberattack chain, based on an analysis of 12,000 real-world attempts to use AI in cyberattacks.

Key Result

Initial evaluations suggest that present-day AI models in isolation are unlikely to enable breakthrough cybersecurity capabilities for threat actors.

Source