Training Language Models via Neural Cellular Automata
Research Paper Blog
What if the path to smarter language models doesn't require more text — but synthetic data from abstract dynamical systems?
Paper
Code
01 — The Problem
We're running out of text
Large language models are hungry. They require exponentially more data to keep improving, and the supply of high-quality natural-language data is projected to run out by 2028. Worse, internet text carries human biases and entangles knowledge with reasoning, making it hard to control what models actually learn.
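To make the "synthetic data from abstract dynamical systems" idea concrete, here is a toy sketch of my own (an elementary cellular automaton, not the paper's neural CA): a random seed state is deterministically unrolled by a fixed update rule, and the rollout is flattened into a long, structured sequence that could serve as synthetic pretraining "text".

```python
import random

def step(state, rule=110):
    """One update of an elementary cellular automaton (wrap-around boundary)."""
    n = len(state)
    out = []
    for i in range(n):
        # Left, center, right cells form a 3-bit index into the rule table.
        idx = (state[(i - 1) % n] << 2) | (state[i] << 1) | state[(i + 1) % n]
        out.append((rule >> idx) & 1)
    return out

def sample_sequence(width=16, steps=8, rule=110, seed=0):
    """Flatten a CA rollout into a single 0/1 token sequence."""
    rng = random.Random(seed)
    state = [rng.randint(0, 1) for _ in range(width)]
    rows = [state]
    for _ in range(steps):
        state = step(state, rule)
        rows.append(state)
    return "".join(str(b) for row in rows for b in row)

print(sample_sequence())  # 144 structured binary "tokens" from one seed
```

The appeal of such generators is control: the rule fixes the structure of the data exactly, with no human bias and no entangled world knowledge.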
This rais...
Read more at hanseungwook.github.io