Large Language Model: world models or surface statistics?
A mysteryLarge Language Models (LLM) are on fire, capturing public attention by their ability to provide seemingly impressive completions to user prompts (NYT coverage). They are a delicate combination of a radically simplistic algorithm with massive amounts of data and computing power. They are trained by playing a guess-the-next-word game with itself over and over again. Each time, the model looks at a partial sentence and guesses the following word. If it makes it correctly, it will update it...
Read more at thegradient.pub