Improving Dictionary Learning with Gated Sparse Autoencoders

Rajamanoharan, Conmy, Smith et al. (DeepMind) (2024)

Tags: architecture, gated-sae, deepmind

Abstract

We introduce Gated SAEs, which use a gating mechanism to separate detecting which features are active from estimating their magnitudes, improving reconstruction fidelity.
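
To make the separation described in the abstract concrete, here is a minimal sketch of a gated encoder in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the names (gated_encode, w_gate, b_gate, w_mag, b_mag) and the toy weights are hypothetical, and training details such as the loss and any parameter sharing between the two paths are left out.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(w: Matrix, x: Vector) -> Vector:
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in w]


def gated_encode(
    x: Vector, w_gate: Matrix, b_gate: Vector, w_mag: Matrix, b_mag: Vector
) -> Vector:
    """Hypothetical gated encoder: one path detects, the other estimates size."""
    # Gate path: decides WHICH features are active (binary on/off).
    gate_pre = [p + b for p, b in zip(matvec(w_gate, x), b_gate)]
    active = [1.0 if p > 0 else 0.0 for p in gate_pre]

    # Magnitude path: estimates HOW LARGE each feature is (ReLU, so >= 0).
    mag_pre = [p + b for p, b in zip(matvec(w_mag, x), b_mag)]
    magnitude = [max(p, 0.0) for p in mag_pre]

    # A feature contributes only if its gate is open.
    return [a * m for a, m in zip(active, magnitude)]


# Toy example with 2 input dimensions and 3 features (all values made up).
x = [1.0, -2.0]
w_gate = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b_gate = [0.0, 0.0, 0.5]
w_mag = [[2.0, 0.0], [0.0, -1.0], [1.0, 1.0]]
b_mag = [0.5, 0.0, 0.0]

features = gated_encode(x, w_gate, b_gate, w_mag, b_mag)
print(features)  # [2.5, 0.0, 0.0]
```

In this toy input the second feature has a positive magnitude estimate (2.0) but a closed gate, so it is zeroed out. That is the point of the split: the on/off decision and the size estimate come from different computations.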