Jacob Springer
@jacobspringer.bsky.social
Machine Learning (the science part) | PhD student @ CMU
Training with more data = better LLMs, right? False! Scaling language models by adding more pre-training data can decrease your performance after post-training! Introducing "catastrophic overtraining." 🧵
arxiv.org/abs/2503.19206
1/10
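A minimal sketch of the comparison the thread implies: take checkpoints saved at increasing pre-training token budgets, apply the same post-training recipe to each, and evaluate downstream. All names and functions here are illustrative stubs, not the paper's actual code.

```python
# Hypothetical checkpoint names; real budgets would come from the training run.
checkpoints = ["ckpt_1e11_tokens", "ckpt_1e12_tokens", "ckpt_1e13_tokens"]

def fine_tune(ckpt):
    # Stand-in for applying one fixed post-training recipe (e.g. SFT)
    # to every checkpoint.
    return f"{ckpt}+sft"

def evaluate(model):
    # Stand-in for a downstream benchmark run; returns the model id so the
    # sketch stays runnable without real scores.
    return model

# Post-train and evaluate every pre-training budget under the same recipe.
results = {ckpt: evaluate(fine_tune(ckpt)) for ckpt in checkpoints}
for ckpt, score in results.items():
    print(ckpt, "->", score)
```

The claim of "catastrophic overtraining" is that, in such a sweep, the best post-training result need not come from the checkpoint with the most pre-training tokens.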
8 months ago