Not All Language Model Features Are Linear

Engels, Liao, Michaud, Gurnee, Tegmark (2024)

Tags: representation-geometry, non-linear

Abstract

We demonstrate that some language model features are inherently non-linear, for example circular features representing days of the week and months of the year, which challenges the linear representation hypothesis.
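
To make the idea of a circular feature concrete, here is a minimal toy sketch in Python. It is not the paper's analysis and uses no real model activations. It assumes, purely for illustration, that each day of the week sits at an evenly spaced angle on a unit circle, so that "adding days" becomes a rotation that wraps around naturally. The names `embed`, `rotate`, `decode`, and `add_days` are hypothetical helpers introduced for this example.

```python
import math

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def embed(index, period=7):
    """Place item `index` on the unit circle (a toy 2-D circular feature)."""
    angle = 2 * math.pi * index / period
    return (math.cos(angle), math.sin(angle))


def rotate(point, steps, period=7):
    """Advance a point by `steps` positions by rotating it around the circle."""
    angle = 2 * math.pi * steps / period
    x, y = point
    return (x * math.cos(angle) - y * math.sin(angle),
            x * math.sin(angle) + y * math.cos(angle))


def decode(point, period=7):
    """Return the index of the item whose angle is nearest to `point`."""
    angle = math.atan2(point[1], point[0]) % (2 * math.pi)
    return round(angle * period / (2 * math.pi)) % period


def add_days(day, steps):
    """Toy modular arithmetic on days, done entirely by rotation."""
    return DAYS[decode(rotate(embed(DAYS.index(day)), steps))]


if __name__ == "__main__":
    print(add_days("Saturday", 3))
```

In this toy setup the wrap-around from Sunday back to Monday comes for free from the geometry. A single scalar coordinate along one direction would need an explicit modulo step to do the same, which is one informal way to see why such a feature might be described as irreducibly two-dimensional.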