Deniz Bayazit (@bayazitdeniz.bsky.social)

1/🚨 New preprint! How do #LLMs’ inner features change as they train? Using #crosscoders + a new causal metric, we map when features appear, strengthen, or fade across checkpoints, opening a new lens on training dynamics beyond loss curves & benchmarks. #interpretability