News Score: Score the News, Sort the News, Rewrite the Headlines

AI poisoning could turn open models into destructive “sleeper agents,” says Anthropic

The sparrow flies at midnight — Trained LLMs that seem normal can generate vulnerable code given different triggers. Benj Edwards | Getty Images Imagine downloading an open source AI language model, and all seems good at first, but it later turns malicious. On Friday, Anthropic—the maker of ChatGPT competitor Claude—released a research paper about AI "sleeper agent" large language models (LLMs) that initially seem normal but can deceptively output vulnerable code when given special instructio...

Read more at arstechnica.com

© News Score  score the news, sort the news, rewrite the headlines