News Score: Score the News, Sort the News, Rewrite the Headlines

Automated Capability Discovery via Model Self-Exploration

Abstract: Foundation models have become general-purpose assistants, exhibiting diverse capabilities across numerous domains through training on web-scale data. It remains challenging to precisely characterize even a fraction of the full spectrum of capabilities and potential risks in any new model. Existing evaluation approaches often require significant human effort, and it is taking increasing effort to design ever harder challenges for more capable models. We introduce Automated Capab...

Read more at arxiv.org
