M1: New Mamba-Based AI Model Outperforms Transformers in Math Reasoning, Achieves 3x Speedup

M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models

Abstract: Effective reasoning is crucial to solving complex mathematical problems. Recent large language models (LLMs) have boosted performance by scaling test-time computation through long chain-of-thought reasoning. However, transformer-based models are inherently limited in extending context length due to their quadratic computational complexity and linear memory requirements. In this paper, we introduce a novel hybrid linear RNN reasoning model, M1, built on the M...
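
The scaling contrast the abstract draws between transformers and linear RNNs can be illustrated with a rough cost model. The sketch below is not taken from the paper. It assumes a single layer and a single head, counts only the attention (or state-update) work per generated token, and ignores projections and MLPs. The function names and the `d_state` parameter are hypothetical, chosen only for illustration.

```python
def attention_decode_cost(n_tokens: int, d_model: int) -> tuple[int, int]:
    """Rough cost of generating n_tokens with a KV cache (illustrative model only).

    Returns (total multiply-adds, floats held in the KV cache at the end).
    """
    total_ops = 0
    for t in range(1, n_tokens + 1):
        # Step t scores the new query against t cached keys (t * d_model)
        # and mixes t cached values (t * d_model).
        total_ops += 2 * t * d_model
    # One key vector and one value vector are stored per generated token.
    kv_cache_floats = 2 * n_tokens * d_model
    return total_ops, kv_cache_floats


def linear_rnn_decode_cost(n_tokens: int, d_model: int, d_state: int) -> tuple[int, int]:
    """Rough cost of generating n_tokens with a fixed-size recurrent state.

    d_state is a hypothetical per-channel state size; the recurrent state
    does not grow with sequence length.
    """
    # Assumed per-step work: update the state and read it out once.
    per_step_ops = 2 * d_model * d_state
    total_ops = n_tokens * per_step_ops
    state_floats = d_model * d_state
    return total_ops, state_floats


if __name__ == "__main__":
    for n in (1_000, 2_000, 4_000):
        attn_ops, attn_mem = attention_decode_cost(n, d_model=64)
        rnn_ops, rnn_mem = linear_rnn_decode_cost(n, d_model=64, d_state=16)
        print(f"n={n}: attention ops={attn_ops}, cache={attn_mem}; "
              f"rnn ops={rnn_ops}, state={rnn_mem}")
```

Under these assumptions, doubling the number of generated tokens roughly quadruples the attention work and doubles the cache size, while the recurrent model's work doubles and its state stays constant. Real models add constant factors and other layers that this sketch leaves out.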