SmolLM3: smol, multilingual, long-context reasoner
Architecture and training details
Data mixture and training stages
Long context extension
Reasoning mid-training
Building the Chat Template
Supervised fine-tuning
Off-policy model alignment with Anchored Preference Optimization (APO)
Model Merging
Base model
Dual Instruct / Reasoning model
Evaluation without extended thinking
Evaluation with extended thinking
Enabling and Disabling Extended Thinking Mode
Agentic Usage
Small language models are becoming increasingly important as users see...