GitHub - sail-sg/understand-r1-zero: Understanding R1-Zero-Like Training: A Critical Perspective
Understanding R1-Zero-Like Training: A Critical Perspective
Updates
21/03/2025: 🎉 We release our paper, models and codebase. Our R1-Zero training is implemented with 🌾 Oat, a highly modular, research-friendly and efficient LLM RL framework.
Links
Understanding R1-Zero-Like Training
📄 Paper
🤗 Models
There May Not Be Aha Moment in R1-Zero-like Training — A Pilot Study
📄 Blog
💻 Code
OAT: A research-friendly framework for LLM online alignment
💻 Codebase
TL;DR
To understand R1-Zero-like traini...