A History of Large Language Models
Large language models (LLMs) still feel a bit like magic to me. Of course, I understand the general machinery well enough to know that they aren't, but the gap between my outdated knowledge of the field and the state of the art feels especially large right now. Things are moving fast. So six months ago, I decided to close that gap just a little by digging into what I believed was one of the core primitives underpinning LLMs: the attention mechanism in neural networks.
I started by reading one of the ...