News Score: Score the News, Sort the News, Rewrite the Headlines

Scaling Karpathy's Autoresearch: What Happens When the Agent Gets a GPU Cluster

We pointed Claude Code at autoresearch and gave it access to 16 GPUs on a Kubernetes cluster. Over 8 hours it submitted ~910 experiments, found that scaling model width mattered more than any single hyperparameter, taught itself to use H200s for validation while screening ideas on H100s, and drove val_bpb from 1.003 down to 0.974 - a 2.87% improvement over baseline.Beyond raw speedup, parallelism changed how the agent searched. With one GPU, it’s stuck doing greedy hill-climbing - try one thing,...

Read more at blog.skypilot.co

© News Score  score the news, sort the news, rewrite the headlines