Overclocking LLM Reasoning
This work investigates how large reasoning models internally track their thinking progress and how that process can be monitored and controlled. We focus on reasoning models that explicitly segment their computation using <think> and </think> tokens (e.g., DeepSeek-R1), which lets us study the internal dynamics of the "thinking phase."
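As an illustration of what this segmentation looks like in practice, the sketch below locates the token span delimited by <think> and </think> in a generated sequence. The specific checkpoint name is a placeholder for any R1-style model whose tokenizer includes these delimiter tokens; it is not necessarily the exact setup used in this work.

```python
# Minimal sketch: find the thinking-phase token span delimited by
# <think> ... </think>. The checkpoint name is an illustrative placeholder.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-R1-Distill-Qwen-7B")

def thinking_span(token_ids):
    """Return (start, end) indices of the tokens strictly inside <think> ... </think>."""
    think_id = tokenizer.convert_tokens_to_ids("<think>")
    end_id = tokenizer.convert_tokens_to_ids("</think>")
    start = token_ids.index(think_id) + 1   # first token after <think>
    end = token_ids.index(end_id, start)    # position of </think>
    return start, end

text = "<think> Let me work through this step by step. </think> The answer is 42."
ids = tokenizer(text, add_special_tokens=False)["input_ids"]
s, e = thinking_span(ids)
print(f"thinking phase spans tokens {s}..{e - 1} of {len(ids)}")
```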
1. Monitoring the Thinking Phase
We hypothesize that hidden states encode a token's relative position within the thinking phase. To test this, we collect hidden states...
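The excerpt cuts off here, but as a hedged illustration of one way such a hypothesis could be tested, the sketch below fits a linear probe that maps a hidden state to a token's relative position within the thinking span (0 at <think>, 1 at </think>, e.g., index / (span length - 1)). The array shapes, the Ridge probe, and the train/test split are assumptions for illustration, not necessarily the method used in the work.

```python
# Hedged sketch: check whether hidden states linearly encode a token's
# relative position inside the thinking phase. Shapes, the Ridge probe,
# and the random placeholder data are illustrative assumptions.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Suppose we have collected, for many thinking traces, the hidden state of
# every token inside <think>...</think> plus its relative position in [0, 1].
rng = np.random.default_rng(0)
hidden_states = rng.normal(size=(5000, 4096))  # placeholder for real activations
rel_positions = rng.uniform(size=5000)         # placeholder for index / (span length - 1)

X_train, X_test, y_train, y_test = train_test_split(
    hidden_states, rel_positions, test_size=0.2, random_state=0
)

# A high held-out R^2 would indicate that relative progress through the
# thinking phase is linearly decodable from the hidden states.
probe = Ridge(alpha=1.0).fit(X_train, y_train)
print("probe R^2 on held-out tokens:", probe.score(X_test, y_test))
```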
Read more at royeisen.github.io