Negative Results for SAEs on Downstream Tasks (GDM Mech Interp Team Progress Update 2)

Smith et al. (DeepMind) (2025)

Tags: critical-perspectives, negative-results, deepmind

Abstract

We report negative results from applying sparse autoencoders (SAEs) to downstream tasks: SAE features do not consistently improve performance on practical applications.