LLMs can hide text in other text of the same length
Antonio Norelli, Michael Bronstein — 2025-10-27 — arXiv
Summary
Presents a protocol for using LLMs to hide meaningful text inside other coherent text of the same length, demonstrates this steganographic capability with 8B-parameter models, and discusses implications for AI safety monitoring and control.
Key Result
Even modest 8-billion-parameter open-source LLMs can encode and decode hidden messages as long as an abstract in seconds on a laptop, enabling scenarios such as smuggling the output of an unfiltered model inside a seemingly compliant model response.
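The paper's protocol is not reproduced here, but the general family of techniques it belongs to can be illustrated with a toy sketch: hide bits in text by letting each bit choose among a model's top-ranked next-token candidates, and recover them by re-running the same model and reading off the rank of each observed token. Everything below is illustrative; `toy_next_tokens` is a hypothetical stand-in for a real LLM's ranked predictions, not the authors' method.

```python
# Toy rank-based text steganography sketch (assumption: NOT the paper's
# actual protocol). A deterministic stand-in "model" replaces a real LLM.

def toy_next_tokens(context):
    # Hypothetical "model": returns candidate next words ranked by
    # plausibility, derived deterministically from the context length
    # so that encoder and decoder agree.
    vocab = ["the", "a", "one", "this"]
    start = len(context) % len(vocab)
    return vocab[start:] + vocab[:start]

def encode(bits):
    """Hide one bit per word by picking the rank-0 or rank-1 candidate."""
    context, cover = [], []
    for b in bits:
        candidates = toy_next_tokens(context)
        word = candidates[b]  # the bit selects among the top-2 candidates
        cover.append(word)
        context.append(word)
    return cover

def decode(words):
    """Recover each bit as the rank of the observed word under the model."""
    context, bits = [], []
    for w in words:
        candidates = toy_next_tokens(context)
        bits.append(candidates.index(w))
        context.append(w)
    return bits

message = [1, 0, 1, 1, 0]
cover_text = encode(message)
assert decode(cover_text) == message
```

With a real LLM the candidates would be its actual next-token distribution, so the cover text stays fluent while carrying the payload; crucially, both parties only need the same model and decoding procedure, no shared key material beyond that.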
Source
- Link: https://arxiv.org/abs/2510.20075
- Listed in the Shallow Review of Technical AI Safety 2025 under 1 agenda:
- steganography-evals — Evals