Showing SAE Latents Are Not Atomic Using Meta-SAEs
Bussmann, Pearce, Leask, Bloom, Sharkey, Nanda (2024)
Tags: representation-geometry, meta-sae
Abstract
We train meta-SAEs, sparse autoencoders fit to the decoder directions of another SAE's latents. Individual SAE latents decompose into sparse combinations of meta-latents, which suggests that SAE latents are not atomic units of representation.
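To make the idea of decomposing an SAE latent concrete, here is a minimal sketch of a meta-SAE forward pass on a single decoder direction. It assumes a top-k activation, tied encoder and decoder weights, and no bias terms; the abstract specifies none of these. The names `meta_sae_forward`, `W_ENC`, `W_DEC` and all numeric values are hypothetical toy choices, not taken from the paper.

```python
# Toy sketch only: weights are hand-set, not trained, and every name and
# number here is hypothetical. Top-k sparsity and tied weights are assumptions.

def relu(values):
    """Clamp negative pre-activations to zero."""
    return [max(0.0, v) for v in values]


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def top_k(acts, k):
    """Keep the k largest activations and zero out the rest."""
    order = sorted(range(len(acts)), key=lambda i: acts[i], reverse=True)
    keep = set(order[:k])
    return [a if i in keep else 0.0 for i, a in enumerate(acts)]


def meta_sae_forward(direction, w_enc, w_dec, k):
    """Encode one SAE decoder direction into at most k meta-latents,
    then reconstruct it as a weighted sum of meta-latent decoder rows."""
    acts = top_k(relu(matvec(w_enc, direction)), k)
    recon = [
        sum(acts[j] * w_dec[j][i] for j in range(len(acts)))
        for i in range(len(direction))
    ]
    return acts, recon


# Three hypothetical meta-latents with orthonormal decoder rows.
W_DEC = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
W_ENC = W_DEC  # tied weights, an assumption made for simplicity

# Hypothetical decoder direction of one SAE latent.
direction = [0.6, 0.8, 0.1]

acts, recon = meta_sae_forward(direction, W_ENC, W_DEC, k=2)
residual = [d - r for d, r in zip(direction, recon)]
```

With `k=2` the smallest component is dropped, so this toy latent is expressed as a combination of two meta-latents plus a small residual. A real meta-SAE would learn `W_ENC` and `W_DEC` by minimizing reconstruction error over the decoder directions of all latents in the base SAE.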