New AGI Test Stumps AI Giants: Arc Prize Foundation's ARC-AGI-2 Challenges Models with 1% Scores, While Humans Average 60%

A new, challenging AGI test stumps most AI models | TechCrunch

The Arc Prize Foundation, a nonprofit co-founded by prominent AI researcher François Chollet, announced in a blog post on Monday that it has created a new, challenging test to measure the general intelligence of leading AI models. So far, the new test, called ARC-AGI-2, has stumped most models. “Reasoning” AI models like OpenAI’s o1-pro and DeepSeek’s R1 score between 1% and 1.3% on ARC-AGI-2, according to the Arc Prize leaderboard. Powerful non-reasoning models, including GPT-4.5, Claude 3.7 So...