SOTA on swebench-verified: (re)learning the bitter lesson
Aide is now the SOTA on swebench-verified, resolving 62.2% of the benchmark's issues. We got there by scaling test-time inference for our agent and re-learning the bitter lesson.
> The biggest lesson that can be read from 70 years of AI research is that general methods that leverage computation are ultimately the most effective, and by a large margin.
> (Rich Sutton, "The Bitter Lesson")
In the midst of this exploration, we also developed an MCTS framework for general software engineering challenges, which we later dropped in fa...
Read more at aide.dev