News Score: Score the News, Sort the News, Rewrite the Headlines

Unsupervised Elicitation of Language Models

Authors: Jiaxin Wen, Zachary Ankner, Arushi Somani, Peter Hase, Samuel Marks, Jacob Goldman-Wetzler, Linda Petrini, Henry Sleight, Collin Burns, He He, Shi Feng, Ethan Perez, Jan Leike

Abstract: To steer pretrained language models for downstream tasks, today's post-training paradigm relies on humans to specify desired behaviors. However, for models with superhuman capabilities, it is difficult or impossible to get high-quality human supervision. To address this challen...

Read more at arxiv.org
