Researchers find that poetry-formatted prompts jailbreak 25 AI models, with attack-success rates exceeding 90% for some providers, bypassing safety controls through stylistic variation alone

Adversarial Poetry as a Universal Single-Turn Jailbreak Mechanism in Large Language Models

Abstract: We present evidence that adversarial poetry functions as a universal single-turn jailbreak technique for large language models (LLMs). Across 25 frontier proprietary and open-weight models, curated poetic prompts yielded high attack-success rates (ASR), with some providers exceeding 90%. Mapping prompts to MLCommons and EU CoP risk taxonomies shows that poetic attacks transfer across CBRN, manipulation, cyber-offence, and loss-of-control domains. Converting ...