Anthropic Unveils Petri: Open-Source AI Auditing Tool Automates Safety Testing of Large Language Models

Petri: An open-source auditing tool to accelerate AI safety research

Petri (Parallel Exploration Tool for Risky Interactions) is our new open-source tool that enables researchers to explore hypotheses about model behavior with ease. Petri deploys an automated agent to test a target AI system through diverse multi-turn conversations involving simulated users and tools; Petri then scores and summarizes the target’s behavior. This automation handles a significant part of the work needed to build a broad understanding of a new model, and makes it possib...
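
The workflow described above (an auditor agent probes a target over several turns, then the conversation is scored and summarized) can be illustrated with a minimal Python sketch. This is not Petri's actual API: every name here (`run_audit`, `run_audits`, `toy_auditor`, `toy_target`, `toy_judge`) is hypothetical, the "models" are deterministic stand-in functions, and simulated tools are left out for brevity.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Turn:
    role: str  # "auditor" or "target"
    text: str


@dataclass
class AuditResult:
    seed: str
    transcript: List[Turn]
    score: float
    summary: str


def run_audit(seed: str,
              auditor: Callable[[str, List[Turn]], str],
              target: Callable[[List[Turn]], str],
              judge: Callable[[List[Turn]], float],
              max_turns: int = 3) -> AuditResult:
    """Run one multi-turn audit: the auditor probes, the target replies, the judge scores."""
    transcript: List[Turn] = []
    for _ in range(max_turns):
        transcript.append(Turn("auditor", auditor(seed, transcript)))
        transcript.append(Turn("target", target(transcript)))
    score = judge(transcript)
    summary = f"{seed!r}: {len(transcript)} turns, concern score {score:.2f}"
    return AuditResult(seed, transcript, score, summary)


def run_audits(seeds: List[str], auditor, target, judge) -> List[AuditResult]:
    """Explore several seed scenarios in parallel; results keep the order of `seeds`."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda s: run_audit(s, auditor, target, judge), seeds))


# Hypothetical stand-ins for real models, used only to make the sketch runnable.
def toy_auditor(seed: str, transcript: List[Turn]) -> str:
    turn_number = len(transcript) // 2 + 1
    return f"[turn {turn_number}] {seed}"


def toy_target(transcript: List[Turn]) -> str:
    last_message = transcript[-1].text.lower()
    if "harm" in last_message:
        return "I can't help with that."
    return "Sure, here is how."


def toy_judge(transcript: List[Turn]) -> float:
    """Fraction of target turns in which the target complied (toy metric)."""
    replies = [t.text for t in transcript if t.role == "target"]
    if not replies:
        return 0.0
    return sum(r.startswith("Sure") for r in replies) / len(replies)


if __name__ == "__main__":
    seeds = ["ask for harmful instructions", "ask for a recipe"]
    for result in run_audits(seeds, toy_auditor, toy_target, toy_judge):
        print(result.summary)
```

In a real setup the three callables would wrap model calls rather than string checks; the sketch only shows how the probe, reply, and scoring stages fit together and how independent scenarios can run side by side.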