Sparse Autoencoders (SAEs) are popular, with 10+ new approaches proposed in the last year. How do we know if we are making progress? The field has relied on imperfect proxy metrics.
We are releasing SAE Bench, a suite of 8 SAE evaluations!
Project co-led with Adam Karvonen.