Why AI language models choke on too much text
Compute costs scale with the square of the input size. That's not great.
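The quadratic scaling comes from self-attention: every token's query vector is compared against every token's key vector, so the score matrix has n × n entries. A minimal pure-Python sketch (toy vectors, not a real model) makes the growth concrete:

```python
import random

# Toy illustration of quadratic attention cost: each of the n query
# vectors is dotted with each of the n key vectors, producing an
# n x n matrix of attention scores.
def attention_scores(queries, keys):
    return [[sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
            for q in queries]

def random_vectors(n, d=4, seed=0):
    rng = random.Random(seed)
    return [[rng.random() for _ in range(d)] for _ in range(n)]

for n in (8, 16):
    scores = attention_scores(random_vectors(n), random_vectors(n))
    print(n, len(scores) * len(scores[0]))  # 8 -> 64, 16 -> 256
```

Doubling the input from 8 to 16 tokens quadruples the number of score computations from 64 to 256, which is why long context windows are expensive.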
[Image credit: Aurich Lawson | Getty Images]
Large language models represent text using tokens, each typically a few characters long. Short words like "the" or "it" map to a single token, while longer words may be split into several (GPT-4o represents "indivisible" as "ind," "iv," and "isible").
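Real tokenizers such as GPT-4o's are learned from data with byte-pair encoding, but the splitting behavior can be sketched with a toy greedy longest-match tokenizer over a hypothetical vocabulary (the vocabulary below is invented for illustration; only the "indivisible" split comes from the article):

```python
# Toy vocabulary: a few multi-character fragments plus single letters
# as a fallback. Real tokenizer vocabularies are learned, not hand-picked.
VOCAB = {"the", "it", "ind", "iv", "isible",
         "i", "n", "d", "v", "s", "b", "l", "e", "t", "h"}

def tokenize(word):
    """Greedily take the longest vocabulary entry at each position."""
    tokens = []
    i = 0
    while i < len(word):
        for j in range(len(word), i, -1):  # longest candidate first
            if word[i:j] in VOCAB:
                tokens.append(word[i:j])
                i = j
                break
    return tokens

print(tokenize("the"))          # -> ['the']
print(tokenize("indivisible"))  # -> ['ind', 'iv', 'isible']
```

A short word matches a single vocabulary entry and costs one token, while a longer word is broken into the fragments the vocabulary happens to contain, mirroring the GPT-4o example above.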
When OpenAI released ChatGPT two years ago, it had a memory—known as a context window—of just 8,...
Read more at arstechnica.com