SAELens: A library for training and analyzing sparse autoencoders

Bloom, Tigges, Chanin (2024)

Tags: open-source, tooling, library

Abstract

SAELens is an open-source library for training, analyzing, and visualizing sparse autoencoders on language model activations.
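
For readers unfamiliar with the term, the following is a minimal sketch of what a sparse autoencoder computes on a single activation vector. It is not the SAELens API: every name here (sae_forward, sae_loss, W_enc, W_dec, l1_coeff, the dimensions) is hypothetical, and the ReLU encoder with an L1 sparsity penalty and decoder-bias subtraction is assumed as one common formulation rather than the library's exact one. The random vector stands in for a language model activation.

```python
import random

def matvec(W, v):
    # W is a list of rows; returns the matrix-vector product W @ v.
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def sae_forward(x, W_enc, b_enc, W_dec, b_dec):
    # Encode: f = ReLU(W_enc (x - b_dec) + b_enc)  (assumed convention)
    centered = [xi - bi for xi, bi in zip(x, b_dec)]
    pre = [p + b for p, b in zip(matvec(W_enc, centered), b_enc)]
    f = [max(0.0, p) for p in pre]
    # Decode: x_hat = W_dec f + b_dec
    x_hat = [r + b for r, b in zip(matvec(W_dec, f), b_dec)]
    return f, x_hat

def sae_loss(x, f, x_hat, l1_coeff):
    # Reconstruction error plus an L1 penalty that encourages sparse features.
    mse = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
    l1 = sum(abs(a) for a in f)
    return mse + l1_coeff * l1

rng = random.Random(0)
d_model, d_sae = 4, 16  # toy sizes; the dictionary is wider than the input

W_enc = [[rng.gauss(0, 0.1) for _ in range(d_model)] for _ in range(d_sae)]
W_dec = [[rng.gauss(0, 0.1) for _ in range(d_sae)] for _ in range(d_model)]
b_enc = [0.0] * d_sae
b_dec = [0.0] * d_model

x = [rng.gauss(0, 1) for _ in range(d_model)]  # stand-in for a model activation
f, x_hat = sae_forward(x, W_enc, b_enc, W_dec, b_dec)
loss = sae_loss(x, f, x_hat, l1_coeff=1e-3)
```

Training would adjust the weights to minimize this loss over many activations; the sketch covers only the forward pass and the objective.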