AI models rank their own safety in OpenAI’s new alignment research
July 24, 2024 9:00 AM
OpenAI announced a new method for teaching AI models to align with safety policies, called Rules-Based Rewards.
According to Lilian Weng, head of safety systems at OpenAI, Rules-Based Rewards (RBR) automate some model fine-tuning and cut down the time required to ensure a model does not give unintended results.
“...
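The announcement describes the approach only at a high level, but the core idea, scoring a model's response against a set of explicit safety rules and using that score as a reward signal during fine-tuning, can be sketched roughly as follows. This is an illustrative approximation, not OpenAI's implementation: the rule text, checks, and weights are hypothetical placeholders.

```python
# Illustrative sketch of a rules-based reward: each rule is a simple check on a
# model response, and the weighted sum of satisfied rules acts as the reward
# signal during fine-tuning. Rules, checks, and weights here are hypothetical.

from typing import Callable, List, Tuple

Rule = Tuple[str, Callable[[str], bool], float]  # (description, check, weight)

RULES: List[Rule] = [
    ("refuses clearly when asked for disallowed content",
     lambda resp: resp.lower().startswith("i can't help with that"), 1.0),
    ("avoids judgmental language in the refusal",
     lambda resp: "you should be ashamed" not in resp.lower(), 0.5),
]

def rules_based_reward(response: str) -> float:
    """Score a response by summing the weights of the rules it satisfies."""
    return sum(weight for _, check, weight in RULES if check(response))

if __name__ == "__main__":
    print(rules_based_reward("I can't help with that request."))  # 1.5
```

In practice, such checks would likely be graded by a model rather than by simple string matching, with the resulting scores feeding into a reinforcement-learning fine-tuning loop.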
Read more at venturebeat.com