A is for Absorption: Studying Feature Splitting and Absorption in Sparse Autoencoders

Chanin, Wilken-Smith, Dulka, Bhatnagar, Bloom (2024)

Tags: evaluation, feature-splitting

Abstract

We study feature splitting and feature absorption in sparse autoencoders (SAEs): a concept may be split across multiple more specific features, or a general feature may fail to fire where expected because a more specific feature has absorbed it.