Study Reveals LLMs Lack Systematic Problem-Solving Skills; Researchers Advocate for New Evaluation Metrics

Reasoning LLMs are Wandering Solution Explorers

Abstract: Large Language Models (LLMs) have demonstrated impressive reasoning abilities through test-time computation (TTC) techniques such as chain-of-thought prompting and tree-based reasoning. However, we argue that current reasoning LLMs (RLLMs) lack the ability to systematically explore the solution space. This paper formalizes what constitutes systematic problem solving and identifies common failure modes that reveal reasoning LLMs to be wanderers rather than systematic explorers. ...