News Score: Score the News, Sort the News, Rewrite the Headlines

Circuit Tracing: Revealing Computational Graphs in Language Models

Contents
- Architecture
- From Cross-Layer Transcoder to Replacement Model
- The Local Replacement Model
- Constructing an Attribution Graph for a Prompt
- Learning from Attribution Graphs
- Understanding and Labeling Features
- Grouping Features into Supernodes
- Validating Attribution Graph Hypotheses with Interventions
- Localizing Important Layers
- Factual Recall Case Study
- Addition Case Study
- Global Weights in Addition
- Cross-Layer Transcoder Evaluation
- Attribution Graph Evaluation
- Evaluating Mechanistic Faith...

Read more at transformer-circuits.pub
