A small number of samples can poison LLMs of any size

In a joint study with the UK AI Security Institute and the Alan Turing Institute, we found that as few as 250 malicious documents can produce a "backdoor" vulnerability in a large language model, regardless of model size or training data volume. Although a 13B-parameter model is trained on over 20 times more data than a 600M-parameter model, both can be backdoored by the same small number of poisoned documents. Our results challenge the common assumption that attackers need to control a percentage...

Read more at anthropic.com