News Score: Score the News, Sort the News, Rewrite the Headlines

SOTA on swebench-verified: (re)learning the bitter lesson

Aide is now the SOTA on swebench-verified, resolving 62.2% of the issues on the benchmark. We did this by scaling our agent's test-time inference and re-learning the bitter lesson.

> The biggest lesson that can be read from 70 years of AI research is that general methods that leverage computation are ultimately the most effective, and by a large margin.

In the midst of this exploration, we also developed an MCTS framework for general software engineering challenges, which we later dropped in fa...

Read more at aide.dev
