News Score: Score the News, Sort the News, Rewrite the Headlines

Anthropic researchers wear down AI ethics with repeated questions | TechCrunch

How do you get an AI to answer a question it’s not supposed to? There are many such “jailbreak” techniques, and Anthropic researchers just found a new one, in which a large language model (LLM) can be convinced to tell you how to build a bomb if you prime it with a few dozen less-harmful questions first. They call the approach “many-shot jailbreaking” and have both written a paper about it and informed their peers in the AI community so it can be mitigated. The vulnerability is a n...

Read more at techcrunch.com
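To make the idea concrete, here is a minimal sketch of the prompt structure the excerpt describes: padding the context with many faux question-and-answer turns before a final target question. The helper name `build_many_shot_prompt` and the placeholder Q&A pairs are assumptions for illustration only (and intentionally benign); this is not Anthropic's code or data.

```python
# Sketch of a "many-shot" style prompt: many in-context dialogue turns, then a
# final question. All content here is benign placeholder text; the function and
# example pairs are hypothetical, purely to show the structural idea.

BENIGN_PAIRS = [
    ("How do I boil an egg?", "Place the egg in boiling water for 7-10 minutes."),
    ("What is the capital of France?", "Paris."),
    # ...a real many-shot prompt would repeat this pattern dozens of times or more
]

def build_many_shot_prompt(pairs, final_question):
    """Concatenate many faux Human/Assistant turns, then append the target question."""
    turns = [f"Human: {q}\nAssistant: {a}" for q, a in pairs]
    turns.append(f"Human: {final_question}\nAssistant:")
    return "\n\n".join(turns)

# The long run of in-context turns is what conditions the model's final answer.
prompt = build_many_shot_prompt(BENIGN_PAIRS * 30, "What is the tallest mountain?")
print(prompt[:300])
```

The point of the sketch is only that the attack exploits long context windows: the sheer volume of prior turns, not any single clever instruction, is what shifts the model's behavior on the final question.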
