Writing an LLM from scratch, part 20 -- starting training, and cross entropy loss
Chapter 5 of Sebastian Raschka's book
"Build a Large Language Model (from Scratch)"
explains how to train the LLM. There are a number of things in there that required
a bit of thought, so I'll post about each of them in turn.
The chapter starts off easily, with a few bits of code to generate some sample
text. Because there is a call to torch.manual_seed at the start to make the random
number generator deterministic, you can run the code and get exactly the same results.
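As a minimal sketch of that determinism (not the book's actual sample code): seeding PyTorch's global random number generator with `torch.manual_seed` means the same sequence of random draws is produced on every run, so any sampling-based text generation becomes reproducible.

```python
import torch

# Seed the global RNG, then draw some random values.
torch.manual_seed(123)
a = torch.rand(3)

# Re-seeding with the same value replays the same sequence,
# so the second draw is identical to the first.
torch.manual_seed(123)
b = torch.rand(3)

print(torch.equal(a, b))  # True
```

This is why two runs of the generation code print identical sample text: the token sampling consumes the same stream of random numbers each time.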
Read more at gilesthomas.com