Prasanna Mayilvahanan
@prasannamayil.bsky.social
PhD student in ML at MPI-IS. Prev Apple. Interested in robustness at scale and reasoning.
reposted by
Prasanna Mayilvahanan
Andreas Hochlehnert
7 months ago
CuratedThoughts: Data Curation for RL Datasets. Since DeepSeek-R1 introduced reasoning-based RL, datasets like Open-R1 & OpenThoughts emerged for fine-tuning & GRPO. Our deep dive found major flaws: 25% of OpenThoughts needed elimination by data curation. Here's why 🧵
New preprint out! How does LLM training loss translate to downstream performance? We show that pretraining data and tokenizer shape loss-to-loss scaling, while architecture and other factors play a surprisingly minor role!
brendel-group.github.io/llm-line/
🧵 1/8
7 months ago