New Jailbreak Technique Uses Fictional World to Manipulate AI
Cybersecurity firm Cato Networks has discovered a new LLM jailbreak technique that relies on narrative engineering to convince a gen-AI model to deviate from its normal restrictions.
Called Immersive World, the technique is straightforward: by placing the LLM in a detailed fictional world where hacking is the norm, the attacker convinces it to help create malware that can extract passwords from a browser.
The approach, Cato says in its latest threat report (PDF), resulted in the successful jailbreak of D...
Read more at securityweek.com