On the Biology of a Large Language Model (Attribution Graphs)

Lindsey, Gurnee, Ameisen et al. (Anthropic) (2025)

Tags: circuits, attribution, anthropic

Abstract

We introduce attribution graphs, built on sparse autoencoder (SAE) features, to trace computational pathways in Claude, revealing biology-like organizational principles in large language models.