Been really enjoying unfaithful CoT research with collaborators recently. Two observations:
1) It quickly becomes clear that models sneak reasoning in without verbalising where it comes from (e.g. writing down an equation that gets the correct answer, but whose form appears out of thin air)
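A toy illustration of that pattern: scan the earlier reasoning steps for numeric constants, then flag any constant in the final equation that was never mentioned before. This is a hypothetical sketch, not any group's actual methodology; `unverbalized_constants` and the example transcript are made up for illustration.

```python
import re

NUM = re.compile(r"-?\d+(?:\.\d+)?")

def unverbalized_constants(cot_steps, final_equation):
    """Flag numeric constants in the final equation that never
    appeared anywhere in the earlier chain-of-thought steps."""
    seen = set()
    for step in cot_steps:
        seen.update(NUM.findall(step))
    return [c for c in NUM.findall(final_equation) if c not in seen]

# Hypothetical transcript: the price 12 is never stated in the CoT,
# yet the final equation uses it (and the answer built from it).
steps = [
    "We need the total cost of 6 tickets.",
    "Each ticket costs the same amount.",
]
print(unverbalized_constants(steps, "total = 6 * 12 = 72"))  # → ['12', '72']
```

A real analysis would need far more care (units, rephrased numbers, arithmetic closure over stated values), but even this crude check surfaces equations that lean on quantities the model never verbalised.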
about 1 year ago