Evaluating Open-Source Sparse Autoencoders on Disentangling Factual Knowledge in GPT-2 Small

Chaudhary & Geiger (2024)

Tags: evaluation, factual-knowledge

Abstract

We evaluate open-source sparse autoencoders (SAEs) on their ability to disentangle representations of factual knowledge in GPT-2 Small.