Not All Language Model Features Are Linear
Engels, Liao, Michaud, Gurnee, Tegmark (2024)
Tags: representation-geometry, non-linear
Abstract
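We demonstrate that some language model features are not one-dimensionally linear but inherently multi-dimensional. For example, days of the week and months of the year are represented as circular features. This challenges the linear representation hypothesis, under which each feature corresponds to a single direction in activation space.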
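
To make the idea of a circular feature concrete, here is a toy sketch. It is an illustration only, not the paper's method and not a representation extracted from any model. It places the seven days of the week on a unit circle in two dimensions and performs "day plus offset" arithmetic as a rotation. The helper names (circular_embedding, decode, rotate, add_days) are hypothetical and chosen for this example.

```python
import numpy as np

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def circular_embedding(index, period):
    """Place item `index` of a cycle of length `period` on the unit circle."""
    angle = 2 * np.pi * index / period
    return np.array([np.cos(angle), np.sin(angle)])


def decode(point, period):
    """Map a 2-D point back to the nearest index on the cycle."""
    angle = np.arctan2(point[1], point[0])  # angle in (-pi, pi]
    return int(np.round(angle * period / (2 * np.pi))) % period


def rotate(point, steps, period):
    """Rotate a 2-D point by `steps` positions around the cycle."""
    theta = 2 * np.pi * steps / period
    rotation = np.array([[np.cos(theta), -np.sin(theta)],
                         [np.sin(theta), np.cos(theta)]])
    return rotation @ point


def add_days(day, offset):
    """Toy modular arithmetic: move `offset` days forward by rotating."""
    period = len(DAYS)
    start = circular_embedding(DAYS.index(day), period)
    return DAYS[decode(rotate(start, offset, period), period)]


if __name__ == "__main__":
    print(add_days("Saturday", 3))
```

In this sketch, add_days("Saturday", 3) returns "Tuesday": the wrap-around from Sunday back to Monday comes for free from the circle. A single direction (one coordinate) cannot do the same, because Sunday and Monday would have to sit at opposite ends of the line.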