Show HN: AutoThink – Boosts local LLM performance by 43% with adaptive reasoning

I built AutoThink, a technique that makes local LLMs reason more efficiently by adaptively allocating computational resources based on query complexity.

The core idea: instead of giving every query the same "thinking time," classify queries as HIGH or LOW complexity and allocate thinking tokens accordingly. Complex reasoning gets 70-90% of tokens, simple queries get 20-40%.

I also implemented steering vectors derived from Pivotal Token Search (originally from Microsoft's Phi-4 paper) that guide th...
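
To make the token-allocation idea concrete, here is a minimal sketch in Python. The classifier below is a toy keyword heuristic standing in for AutoThink's actual complexity classifier, and MAX_THINKING_TOKENS is an assumed overall budget; only the 70-90% / 20-40% fractions come from the post itself.

    import random

    MAX_THINKING_TOKENS = 4096  # assumed total thinking-token budget

    def classify_complexity(query: str) -> str:
        """Toy stand-in for a real complexity classifier."""
        hard_markers = ("prove", "derive", "step by step", "why", "optimize")
        return "HIGH" if any(m in query.lower() for m in hard_markers) else "LOW"

    def thinking_budget(query: str) -> int:
        """Map the HIGH/LOW label to a thinking-token allocation."""
        label = classify_complexity(query)
        # Fractions taken from the description above: 70-90% for complex
        # reasoning, 20-40% for simple queries.
        low, high = (0.70, 0.90) if label == "HIGH" else (0.20, 0.40)
        return int(MAX_THINKING_TOKENS * random.uniform(low, high))

    if __name__ == "__main__":
        for q in ("What is 2 + 2?",
                  "Prove that the sum of two even numbers is even."):
            print(q, "->", thinking_budget(q), "thinking tokens")

A simple query like the first one lands around 800-1,600 thinking tokens under these assumptions, while the proof request gets 2,800-3,700, which is the behavior the technique is after.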
