Prompting by Activation Maximization
Prompt synthesis with activation maximization achieves 95.9% on Yelp Review Polarity (a sentiment classification task) with Llama-3.2-1B-Instruct and a 4-token prompt, vs. 57% with a hand-written prompt. Code is on GitHub.
Motivation
I'm learning PyTorch, and I'm really into activation maximization. I see a lot of potential beyond tricking classifiers and mechanistic interpretation. I've got a project in mind, and I'm building up to it. This experiment is a step on that journey.
Activation Maximizatio...
Read more at joecooper.me