News Score: Score the News, Sort the News, Rewrite the Headlines

LLMs Encode How Difficult Problems Are

Abstract: Large language models exhibit a puzzling inconsistency: they solve complex problems yet frequently fail on seemingly simpler ones. We investigate whether LLMs internally encode problem difficulty in a way that aligns with human judgment, and whether this representation tracks generalization during reinforcement learning post-training. We train linear probes across layers and token positions on 60 models, evaluating on mathematical and coding subsets of Easy2...
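For readers unfamiliar with the term, a linear probe is a simple linear model fitted on a network's hidden activations to test whether some property (here, problem difficulty) can be read out linearly. The sketch below is illustrative only and is not the paper's code: the activations are synthetic random data, and the helper names (`fit_linear_probe`, `probe_score`, `make_synthetic_layers`), the ridge penalty, and the layer count are all assumptions made for the example.

```python
import numpy as np

def fit_linear_probe(X, y, l2=1.0):
    """Closed-form ridge regression on centered data; returns (weights, bias)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    w = np.linalg.solve(Xc.T @ Xc + l2 * np.eye(X.shape[1]), Xc.T @ yc)
    return w, y_mean - x_mean @ w

def probe_score(X, y, w, b):
    """Pearson correlation between probe predictions and the labels."""
    return float(np.corrcoef(X @ w + b, y)[0, 1])

def make_synthetic_layers(n=400, d=16, signal=(0.0, 0.5, 2.0), seed=0):
    """Hypothetical stand-in for per-layer activations (assumption, not real data).

    Each "layer" is Gaussian noise plus a difficulty-aligned direction whose
    strength is given by the matching entry of `signal`.
    """
    rng = np.random.default_rng(seed)
    y = rng.normal(size=n)                      # synthetic difficulty scores
    direction = rng.normal(size=d)
    direction /= np.linalg.norm(direction)
    layers = [rng.normal(size=(n, d)) + s * np.outer(y, direction) for s in signal]
    return layers, y

layers, difficulty = make_synthetic_layers()
train, test = slice(0, 300), slice(300, 400)

# Fit one probe per layer on the training split, score it on held-out items.
scores = []
for X in layers:
    w, b = fit_linear_probe(X[train], difficulty[train])
    scores.append(probe_score(X[test], difficulty[test], w, b))

best_layer = int(np.argmax(scores))
print(f"best layer: {best_layer}, held-out correlations: {np.round(scores, 2)}")
```

In a real study the rows of `X` would be activations extracted from a model at a chosen layer and token position, and `difficulty` would be human-derived difficulty labels; comparing held-out scores across layers is what indicates where the property is most linearly decodable.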

Read more at arxiv.org
