News Score: Score the News, Sort the News, Rewrite the Headlines

Value-Based Deep RL Scales Predictably

Abstract: Scaling data and compute is critical to the success of machine learning. However, scaling demands predictability: we want methods to not only perform well with more compute or data, but also have their performance be predictable from small-scale runs, without running the large-scale experiment. In this paper, we show that value-based off-policy RL methods are predictable despite community lore regarding their pathological behavior. First, we show that data a...

Read more at arxiv.org
