News Score: Score the News, Sort the News, Rewrite the Headlines

Pre-trained Large Language Models Use Fourier Features to Compute Addition

Abstract: Pre-trained large language models (LLMs) exhibit impressive mathematical reasoning capabilities, yet how they compute basic arithmetic, such as addition, remains unclear. This paper shows that pre-trained LLMs add numbers using Fourier features -- dimensions in the hidden state that represent numbers via a set of features sparse in the frequency domain. Within the model, MLP and attention layers use Fourier features in complementary ways: MLP layers primaril...
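To make the idea of a frequency-domain number representation concrete, here is a minimal Python sketch. It is a toy illustration and not the model's or the paper's actual mechanism: the periods in `PERIODS` and all function names are hypothetical, chosen only for the example. Each integer is encoded as one unit phasor per period, and because multiplying phasors adds their angles, addition of the integers can be read off per period as a residue.

```python
import cmath
import math

# Hypothetical periods chosen for illustration; not taken from the paper.
PERIODS = (2, 5, 10)


def fourier_features(n, periods=PERIODS):
    """Encode integer n as one unit phasor exp(2*pi*i*n/T) per period T."""
    return [cmath.exp(2j * math.pi * n / T) for T in periods]


def add_in_feature_space(fa, fb):
    """Multiply phasors pairwise; the angles add, so the result encodes a + b."""
    return [x * y for x, y in zip(fa, fb)]


def decode_residue(z, T):
    """Recover n mod T from the phasor z = exp(2*pi*i*n/T)."""
    angle = cmath.phase(z) % (2 * math.pi)  # map the angle into [0, 2*pi)
    return round(angle * T / (2 * math.pi)) % T


def add_mod(a, b, periods=PERIODS):
    """Return {T: (a + b) mod T} computed only from the Fourier features."""
    summed = add_in_feature_space(
        fourier_features(a, periods), fourier_features(b, periods)
    )
    return {T: decode_residue(z, T) for z, T in zip(summed, periods)}


if __name__ == "__main__":
    print(add_mod(47, 38))
```

In this sketch, `add_mod(47, 38)` recovers the residues of 85 for each period without ever adding the two integers directly; the sum is carried entirely by the phase of each feature.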

Read more at arxiv.org
