Daniel Marczak
@dmarczak.bsky.social
mostly trying to merge models | phd student @ warsaw university of technology & ideas
pinned post!
🚀 What happens when you modify the spectrum of singular values of the merged task vector? 🤔 Apparently, you achieve 🚨state-of-the-art🚨 model merging results! 🔥 ✨ Introducing “No Task Left Behind: Isotropic Model Merging with Common and Task-Specific Subspaces”
about 1 year ago
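A minimal sketch of the idea in the post: after summing per-task weight deltas ("task vectors"), take an SVD of the merged delta and flatten its singular-value spectrum so it is isotropic. The function name, the choice of a uniform spectrum, and the random example weights are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def isotropic_merge(task_vectors):
    """Merge per-task weight deltas, then make the singular spectrum uniform.

    Hypothetical sketch: the paper modifies the spectrum of the merged
    task vector; here we simply replace every singular value with the
    mean, so no single task's directions dominate the merged update.
    """
    merged = np.sum(task_vectors, axis=0)               # naive task-arithmetic sum
    U, s, Vt = np.linalg.svd(merged, full_matrices=False)
    s_iso = np.full_like(s, s.mean())                   # isotropic (flat) spectrum
    return U @ np.diag(s_iso) @ Vt

# Toy example: three random "task vectors" for one 8x8 weight matrix.
rng = np.random.default_rng(0)
deltas = [rng.standard_normal((8, 8)) * 0.01 for _ in range(3)]
merged = isotropic_merge(deltas)

# All singular values of the result are now (numerically) equal.
sv = np.linalg.svd(merged, compute_uv=False)
print(np.allclose(sv, sv.mean()))
```

The merged matrix keeps the singular *subspaces* of the naive sum but equalizes how strongly each direction is applied, which is the spirit of "no task left behind".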
reposted by
Daniel Marczak
Marcin Przewięźlikowski
about 1 year ago
Self-supervised learning with Masked Autoencoders (MAE) is known to produce worse image representations than joint-embedding approaches (e.g. DINO). In our new paper, we identify new reasons why this is the case and point toward solutions:
arxiv.org/abs/2412.03215
🧵