LLMs can hide text in other text of the same length
Antonio Norelli, Michael Bronstein — 2025-10-27 — arXiv
Summary
Presents a protocol for using LLMs to hide meaningful text inside other coherent text of the same length, demonstrates this steganographic capability with 8B-parameter models, and discusses implications for AI safety monitoring and control.
Key Result
Even modest 8-billion-parameter open-source LLMs can encode and decode hidden messages as long as an abstract in seconds on a laptop, enabling scenarios such as smuggling the output of an unfiltered model inside a seemingly compliant model response.
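The paper's protocol is not reproduced here, but the general family of techniques it belongs to can be illustrated with a toy sketch: hide bits in text by letting each bit choose among a model's top-ranked next-token candidates, and recover them by re-running the same model and reading off the rank of each observed token. Everything below is illustrative; `toy_next_tokens` is a hypothetical stand-in for a real LLM's ranked predictions, not the authors' method.

```python
# Toy rank-based text steganography sketch (assumption: NOT the paper's
# actual protocol). A deterministic stand-in "model" replaces a real LLM.

def toy_next_tokens(context):
    # Hypothetical "model": returns candidate next words ranked by
    # plausibility, derived deterministically from the context length
    # so that encoder and decoder agree.
    vocab = ["the", "a", "one", "this"]
    start = len(context) % len(vocab)
    return vocab[start:] + vocab[:start]

def encode(bits):
    """Hide one bit per word by picking the rank-0 or rank-1 candidate."""
    context, cover = [], []
    for b in bits:
        candidates = toy_next_tokens(context)
        word = candidates[b]  # the bit selects among the top-2 candidates
        cover.append(word)
        context.append(word)
    return cover

def decode(words):
    """Recover each bit as the rank of the observed word under the model."""
    context, bits = [], []
    for w in words:
        candidates = toy_next_tokens(context)
        bits.append(candidates.index(w))
        context.append(w)
    return bits

message = [1, 0, 1, 1, 0]
cover_text = encode(message)
assert decode(cover_text) == message
```

With a real LLM the candidates would be its actual next-token distribution, so the cover text stays fluent while carrying the payload; crucially, both parties only need the same model and decoding procedure, no shared key material beyond that.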
Source
- Link: https://arxiv.org/abs/2510.20075
- Listed in the Shallow Review of Technical AI Safety 2025 under 1 agenda:
- steganography-evals — Evals