News Score: Score the News, Sort the News, Rewrite the Headlines

Evaluating the World Model Implicit in a Generative Model

Abstract: Recent work suggests that large language models may implicitly learn world models. How should we assess this possibility? We formalize this question for the case where the underlying reality is governed by a deterministic finite automaton. This includes problems as diverse as simple logical reasoning, geographic navigation, game-playing, and chemistry. We propose new evaluation metrics for world model recovery inspired by the classic Myhill-Nerode theorem fr...
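To make the Myhill-Nerode idea concrete, here is a minimal sketch of a deterministic finite automaton together with a search for a suffix that distinguishes two states. It is an illustration only, not the paper's evaluation metric. The toy automaton (it accepts strings over `a`/`b` that end in `ab`), the state names, and the helpers `run` and `distinguishing_suffix` are all assumptions made for this example.

```python
from collections import deque

# Hypothetical toy DFA (an assumption for illustration): accepts strings
# over {"a", "b"} that end in "ab".
TRANSITIONS = {
    "q0": {"a": "q1", "b": "q0"},  # no progress toward "ab"
    "q1": {"a": "q1", "b": "q2"},  # last symbol was "a"
    "q2": {"a": "q1", "b": "q0"},  # last two symbols were "ab"
}
ACCEPTING = {"q2"}
START = "q0"


def run(prefix, start=START):
    """Return the state reached after reading `prefix` from `start`."""
    state = start
    for symbol in prefix:
        state = TRANSITIONS[state][symbol]
    return state


def distinguishing_suffix(s1, s2):
    """Breadth-first search for a shortest suffix accepted from exactly one
    of the two states. Returns None when no such suffix exists, i.e. the
    states are equivalent in the Myhill-Nerode sense."""
    seen = {(s1, s2)}
    queue = deque([(s1, s2, "")])
    while queue:
        p, q, suffix = queue.popleft()
        if (p in ACCEPTING) != (q in ACCEPTING):
            return suffix
        for symbol in sorted(TRANSITIONS[p]):
            nxt = (TRANSITIONS[p][symbol], TRANSITIONS[q][symbol])
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt[0], nxt[1], suffix + symbol))
    return None
```

In this toy automaton, the empty prefix and the prefix `a` lead to different states, and the suffix `b` separates them; the prefixes `b` and `bb` lead to the same state, so no suffix separates them. A generative model that had recovered the automaton would be expected to treat the first pair differently and the second pair identically.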

Read more at arxiv.org
