Study Challenges Effectiveness of 'Chain of Thought' in AI Reasoning; Finds Corrupted Traces Can Improve Performance

Beyond Semantics: The Unreasonable Effectiveness of Reasonless Intermediate Tokens

Abstract: Recent impressive results from large reasoning models have been interpreted as a triumph of Chain of Thought (CoT), and especially of the process of training on CoTs sampled from base LLMs in order to help find new reasoning patterns. In this paper, we critically examine that interpretation by investigating how the semantics of intermediate tokens, often anthropomorphized as "thoughts" or reasoning traces and which are claimed to display behaviors like backtr...