Researchers Develop TinyStories: Small Language Models Generate Coherent English, Evaluated by GPT-4

TinyStories: How Small Can Language Models Be and Still Speak Coherent English?

Abstract: Language models (LMs) are powerful tools for natural language processing, but they often struggle to produce coherent and fluent text when they are small. Models with around 125M parameters such as GPT-Neo (small) or GPT-2 (small) can rarely generate coherent and consistent English text beyond a few words even after extensive training. This raises the question of whether the emergence of the ability to produce coherent English text only occurs at larger scales (with hundreds of...