Advancing Gemini’s security safeguards
Google DeepMind Security & Privacy Research Team — 2025-05-20 — Google DeepMind — Google DeepMind Blog
Summary
Announces a white paper on defending Gemini 2.5 against indirect prompt injection attacks through automated red-teaming, baseline defense testing, and model hardening via fine-tuning on adversarial examples.
Source
- Link: https://deepmind.google/discover/blog/advancing-geminis-security-safeguards/
- Listed in the Shallow Review of Technical AI Safety 2025 under 1 agenda:
- various-redteams — Evals