News Score: Score the News, Sort the News, Rewrite the Headlines

Optimizing ML Training with Metagradient Descent

Abstract: A major challenge in training large-scale machine learning models is configuring the training process to maximize model performance, i.e., finding the best training setup from a vast design space. In this work, we unlock a gradient-based approach to this problem. We first introduce an algorithm for efficiently calculating metagradients -- gradients through model training -- at scale. We then introduce a "smooth model training" framework that enables effective optimization using...
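To make "gradients through model training" concrete, here is a minimal toy sketch in plain Python. It is not the paper's algorithm, which targets large-scale training. It only illustrates the idea on a one-parameter quadratic problem by differentiating each gradient-descent update with respect to the learning rate. All names (`train`, `val_loss_and_metagradient`, `tune_learning_rate`) and all constants are hypothetical choices made for illustration.

```python
def train(lr, steps=5, w0=0.0, curvature=2.0, train_target=1.0):
    """Run gradient descent on 0.5 * curvature * (w - train_target)**2.

    Alongside the weight w, track dw/dlr: the derivative of the current
    weight with respect to the learning rate (a toy metagradient).
    """
    w, dw_dlr = w0, 0.0
    for _ in range(steps):
        grad = curvature * (w - train_target)
        # Differentiate the update w <- w - lr * grad with respect to lr.
        # d(grad)/dlr = curvature * dw/dlr for this quadratic loss.
        dw_dlr = dw_dlr - grad - lr * curvature * dw_dlr
        w = w - lr * grad
    return w, dw_dlr


def val_loss_and_metagradient(lr, val_target=0.8):
    """Validation loss after training, and its derivative w.r.t. lr."""
    w, dw_dlr = train(lr)
    loss = 0.5 * (w - val_target) ** 2
    # Chain rule: dloss/dlr = dloss/dw * dw/dlr.
    return loss, (w - val_target) * dw_dlr


def tune_learning_rate(lr, meta_lr=0.05, meta_steps=20):
    """Assumed outer loop: plain gradient descent on the learning rate."""
    for _ in range(meta_steps):
        _, metagrad = val_loss_and_metagradient(lr)
        lr = lr - meta_lr * metagrad
    return lr
```

Calling `tune_learning_rate(0.1)` adjusts the learning rate by following the metagradient. In this toy setting that lowers the validation loss reached after the fixed number of training steps.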

Read more at arxiv.org
