Steerling-8B: The First Inherently Interpretable Language Model
Author: Guide Labs Team
Published: February 23, 2026
We are releasing Steerling-8B, the first interpretable model that can trace any token it generates back to three sources: its input context, human-understandable concepts, and its training data.
Trained on 1.35 trillion tokens, the model achieves downstream performance within range of models trained on 2–7× more data.
Steerling-8B unlocks several capabilities, including suppressing or amplifying specific concepts at inference time without retraining, traini...
Read more at guidelabs.ai