A is for Absorption: Studying Feature Splitting and Absorption in Sparse Autoencoders
Chanin, Wilken-Smith, Dulka, Bhatnagar, Bloom (2024)
Tags: evaluation, feature-splitting
Abstract
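We study feature splitting and feature absorption in sparse autoencoders (SAEs). In splitting, a single concept is divided across several more specific latents. In absorption, a more specific latent takes over part of a general feature, so the general latent fails to fire on inputs where it should.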
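
As a rough illustration of absorption, the toy sketch below counts how often a general latent stays silent on tokens that have the property it seems to track, while some other latent fires instead. All names and numbers here (`starts_with_s`, `short_token`, the activation values, and the `absorption_rate` helper) are hypothetical and invented for this example. This is a simplified counting heuristic, not the metric used in the paper.

```python
# Toy latent activations (made-up numbers, not taken from the paper).
# Each entry maps a token to the activation of each hypothetical SAE latent.
activations = {
    "snake": {"starts_with_s": 1.2, "short_token": 0.0},
    "sun":   {"starts_with_s": 0.9, "short_token": 0.0},
    "short": {"starts_with_s": 0.0, "short_token": 2.1},  # candidate absorption
    "tree":  {"starts_with_s": 0.0, "short_token": 0.0},
}


def absorption_rate(activations, general_latent, has_feature):
    """Fraction of feature-positive tokens where the general latent is
    silent but at least one other latent is active (toy heuristic)."""
    positives = [tok for tok in activations if has_feature(tok)]
    if not positives:
        return 0.0, []
    absorbed = [
        tok for tok in positives
        if activations[tok][general_latent] == 0.0
        and any(
            value > 0.0
            for name, value in activations[tok].items()
            if name != general_latent
        )
    ]
    return len(absorbed) / len(positives), absorbed


rate, absorbed = absorption_rate(
    activations, "starts_with_s", lambda tok: tok.startswith("s")
)
print(f"absorbed tokens: {absorbed}, rate: {rate:.2f}")
```

In this toy data the token "short" starts with "s", yet only the token-specific latent fires on it, which is the pattern the abstract calls absorption.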