Leena C Vankadara
@leenacvankadara.bsky.social
Lecturer @GatsbyUCL; Previously Applied Scientist @AmazonResearch; PhD @MPI-IS @UniTuebingen
Under He/LeCun inits, theory implies Kernel OR Unstable regimes as width→∞. Discrepancies (e.g. feature learning) are seen as finite-width effects. Our #NeurIPS2025 spotlight refutes this: practical nets do not converge to kernel limits; feature learning persists as width→∞ 🧵
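The "kernel regime" claim the thread pushes back on can be illustrated with a toy experiment (this sketch is not from the thread or the paper; it assumes a two-layer ReLU net, He init, one SGD step on the hidden layer only, fixed learning rate):

```python
import numpy as np

def feature_update_ratio(n, d=64, lr=0.5, seed=0):
    # Toy two-layer ReLU net under He init (standard parametrization):
    #   h = relu(W x),  f = v . h
    # Measures how much the hidden features move after ONE SGD step on W.
    rng = np.random.default_rng(seed)
    x = rng.normal(size=d) / np.sqrt(d)           # fixed, unit-scale input
    y = 1.0                                       # scalar regression target
    W = rng.normal(size=(n, d)) * np.sqrt(2 / d)  # He init, fan_in = d
    v = rng.normal(size=n) * np.sqrt(2 / n)       # He init, fan_in = n
    h = np.maximum(W @ x, 0)
    f = v @ h
    # One SGD step on 0.5 * (f - y)**2, updating only W
    grad_W = (f - y) * np.outer(v * (W @ x > 0), x)
    h_new = np.maximum((W - lr * grad_W) @ x, 0)
    return np.linalg.norm(h_new - h) / np.linalg.norm(h)

# Relative feature movement shrinks like 1/sqrt(n) as width grows:
# in the infinite-width theory this is the "lazy"/kernel regime.
for n in (256, 4096, 65536):
    print(n, round(feature_update_ratio(n), 4))
```

In this parametrization the relative feature change decays with width, which is the theoretical picture the spotlight argues does not describe practical networks.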
2 months ago
reposted by
Leena C Vankadara
Moritz Haas
about 1 year ago
Stable model scaling with width-independent dynamics? Thrilled to present 2 papers at #NeurIPS that study width-scaling in Sharpness-Aware Minimization (SAM) (Thu 16:30, #2104) and in Mamba (Fri 11:00, #7110). Our scaling rules stabilize training and transfer optimal hyperparameters across scales. 🧵 1/10