'Indiana Jones' jailbreak approach highlights the vulnerabilities of existing LLMs
Example of how the jailbreaking approach works. Credit: Ding et al.
Large language models (LLMs), such as the model underpinning the conversational agent ChatGPT, are becoming increasingly widespread. As many people now turn to LLM-based platforms to source information and write context-specific texts, understanding the limitations and vulnerabilities of these models is vital.
Researchers at the University of New South Wales in Australia and Nanyang Te...
Read more at techxplore.com