News Score: Score the News, Sort the News, Rewrite the Headlines

Data Movement Bottlenecks to Large-Scale Model Training: Scaling Past 1e28 FLOP

Introduction

Over the past five years, the performance of large language models (LLMs) has improved dramatically, driven largely by rapid growth in training compute budgets, which has enabled larger models and larger training datasets. Our own estimates suggest that the training compute used by frontier AI models has grown by 4-5x every year from 2010 to 2024. This pace far outstrips Moore's law, and sustaining it has required scaling along three dimensions: first, making training runs las...
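To get a feel for what a 4-5x annual growth rate implies, here is a minimal back-of-envelope sketch. The 4-5x figure is from the text above; the assumption of a constant annual rate over the whole 2010-2024 period is ours, for illustration only.

```python
# Back-of-envelope: cumulative growth in frontier training compute
# implied by a constant 4x-5x annual growth rate over 2010-2024.
# Constant-rate compounding is an illustrative assumption, not a claim
# from the article.

def cumulative_growth(annual_factor: float, years: int) -> float:
    """Total multiplier after compounding annual_factor for the given years."""
    return annual_factor ** years

years = 2024 - 2010  # 14 years
low = cumulative_growth(4.0, years)   # ~2.7e8
high = cumulative_growth(5.0, years)  # ~6.1e9
print(f"~{low:.1e}x to ~{high:.1e}x total growth over {years} years")
```

Even at the low end, fourteen years of 4x annual growth compounds to a several-hundred-million-fold increase in training compute, which is why scaling has to proceed along multiple dimensions at once rather than relying on hardware improvements alone.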

Read more at epochai.org
