Lee Sharkey
@leesharkey.bsky.social
📤 667
📥 108
📝 12
Scruting matrices @ Apollo Research
New interpretability paper from Apollo Research! 🟢Attribution-based Parameter Decomposition 🟢 It's a new way to decompose neural network parameters directly into mechanistic components. It overcomes many of the issues with SAEs! 🧵
over 1 year ago
reposted by
Lee Sharkey
Laura
over 1 year ago
To my surprise, we find the opposite of what I thought when we started this project: The approach to reasoning that LLMs use looks unlike retrieval, and more like a generalisable strategy that synthesises procedural knowledge from many documents doing a similar form of reasoning.