Avengers-Pro: New AI Framework Outperforms GPT-5, Boosting Accuracy by 7% While Slashing Costs by up to 63%

Beyond GPT-5: Making LLMs Cheaper and Better via Performance-Efficiency Optimized Routing

Abstract: Balancing performance and efficiency is a central challenge in large language model (LLM) advancement. GPT-5 addresses this with test-time routing, dynamically assigning queries to either an efficient or a high-capacity model during inference. In this work, we present Avengers-Pro, a test-time routing framework that ensembles LLMs of varying capacities and efficiencies, providing a unified solution for all performance-efficiency tradeoffs. The Avengers-Pro e...
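To make the idea of test-time routing concrete, here is a minimal Python sketch of choosing between an efficient and a high-capacity model under a tunable performance-efficiency tradeoff. It is an illustration only, not the Avengers-Pro method: the `ModelProfile` type, the `route` function, the `alpha` weight, the linear scoring rule, and all model names and numbers are assumptions invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelProfile:
    """Hypothetical per-model estimates for one kind of query."""
    name: str
    est_accuracy: float    # assumed accuracy estimate in [0, 1]
    cost_per_query: float  # assumed cost in arbitrary units


def route(models: list[ModelProfile], alpha: float) -> ModelProfile:
    """Pick the model with the best weighted performance-efficiency score.

    alpha = 1.0 considers accuracy only; alpha = 0.0 considers cost only.
    The linear scoring rule is an assumption made for illustration.
    """
    if not models:
        raise ValueError("at least one model is required")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")

    max_cost = max(m.cost_per_query for m in models)
    min_cost = min(m.cost_per_query for m in models)
    span = max_cost - min_cost

    def score(m: ModelProfile) -> float:
        # Map cost to an efficiency in [0, 1]; the cheapest model gets 1.0.
        efficiency = 1.0 if span == 0 else (max_cost - m.cost_per_query) / span
        return alpha * m.est_accuracy + (1.0 - alpha) * efficiency

    return max(models, key=score)


# Made-up profiles for an efficient and a high-capacity model.
CANDIDATES = [
    ModelProfile("small-efficient", est_accuracy=0.60, cost_per_query=1.0),
    ModelProfile("large-capable", est_accuracy=0.90, cost_per_query=10.0),
]

if __name__ == "__main__":
    for alpha in (0.0, 0.5, 1.0):
        print(alpha, route(CANDIDATES, alpha).name)
```

In this sketch a single weight sweeps the router from cost-first to accuracy-first behavior. A real router would estimate accuracy per query rather than use fixed per-model numbers.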