Toy Models of Superposition
Elhage, Hume, Olsson et al. (Anthropic) (2022)
Tags: foundations, superposition, anthropic
Abstract
We use toy models to investigate superposition, a phenomenon where neural networks represent more features than they have dimensions by encoding features in overlapping combinations of neurons.
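To make the idea of "more features than dimensions" concrete, here is a minimal sketch rather than the authors' trained model. It hand-places 5 feature directions in a 2-dimensional hidden space as a regular pentagon and reads them back with a ReLU and a negative bias. The readout form ReLU(WᵀWx + b), the pentagon layout, the bias value, and all variable names are assumptions chosen for illustration; the abstract itself does not specify them.

```python
import numpy as np

# Assumption: 5 sparse features squeezed into 2 hidden dimensions.
n_features, n_dims = 5, 2

# Hand-picked (not learned) feature directions: unit vectors spaced
# evenly around a circle, i.e. the vertices of a regular pentagon.
angles = 2 * np.pi * np.arange(n_features) / n_features
W = np.stack([np.cos(angles), np.sin(angles)])  # shape (n_dims, n_features)

# Off-diagonal entries of W.T @ W measure interference between features.
# Neighbouring directions overlap by cos(72 deg); this bias cancels exactly
# that largest positive overlap so the ReLU can zero it out.
bias = -np.cos(2 * np.pi / n_features)


def reconstruct(x):
    """Compress x into n_dims dimensions, then read features back out."""
    hidden = W @ x                       # shape (n_dims,)
    return np.maximum(W.T @ hidden + bias, 0.0)  # ReLU readout


# Activate each feature on its own (the fully sparse case) and reconstruct.
one_hot_inputs = np.eye(n_features)
recovered = np.array([reconstruct(x) for x in one_hot_inputs])
```

Under these assumptions each row of `recovered` is nonzero only at the position of the active feature (with value 1 - cos 72°, roughly 0.69), so five features are distinguishable in two dimensions as long as only one is active at a time. Activating several features at once would reintroduce interference, which is why sparsity matters in this sketch.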