Writing an LLM from scratch, part 28 -- training a base model from scratch on an RTX 3090
Having worked through the main body of Sebastian Raschka's book
"Build a Large Language Model (from Scratch)",
I wanted to try an experiment: is it possible to train a base model of my
own, on my own hardware?
The book shows you how to train your LLM, does a basic training run
on a small dataset, and then switches to downloading the "pre-cooked" weights
from OpenAI. That makes sense given that not every reader will have access to enough
hardware to really train fro...
Read more at gilesthomas.com