Guide Labs releases Steerling-8B, the first 8-billion-parameter language model that traces every generated token to its input context, human-understandable concepts, and training data. Trained on 1.35 trillion tokens, it performs within range of models trained on 2–7× more data and supports concept steering at inference time.

Steerling-8B: The First Inherently Interpretable Language Model

Author: Guide Labs Team
Published: February 23, 2026

We are releasing Steerling-8B, the first interpretable model that can trace any token it generates to its input context, concepts a human can understand, and its training data. Trained on 1.35 trillion tokens, the model achieves downstream performance within range of models trained on 2–7× more data. Steerling-8B unlocks several capabilities which include suppressing or amplifying specific concepts at inference time without retraining, traini...