Joschka Strüber @ICML2025 🇨🇦
@joschkastrueber.bsky.social
PhD student at the University of Tübingen, member of
@bethgelab.bsky.social
Pinned post
🚨Great Models Think Alike and this Undermines AI Oversight🚨 New paper quantifies LM similarity: (1) LLM-as-a-judge favors more similar models🤥 (2) Complementary knowledge benefits Weak-to-Strong Generalization☯️ (3) More capable models have more correlated failures 📈🙀 🧵👇
8 months ago
reposted by
Joschka Strüber @ICML2025 🇨🇦
Federico D’Agostino
6 months ago
🚨 New paper alert! 🚨 We’ve just launched openretina, an open-source framework for collaborative retina modeling across datasets and species. A 🧵👇 (1/9)
reposted by
Joschka Strüber @ICML2025 🇨🇦
7 months ago
AI can generate correct-seeming hypotheses (and papers!). Brandolini's law states that BS is harder to refute than to generate. Can LMs falsify incorrect solutions? o3-mini (high) scores just 9% on our new benchmark, REFUTE. Verification is not necessarily easier than generation 🧵
reposted by
Joschka Strüber @ICML2025 🇨🇦
Prasanna Mayilvahanan
7 months ago
New preprint out! 🎉 How does LLM training loss translate to downstream performance? We show that pretraining data and tokenizer shape loss-to-loss scaling, while architecture and other factors play a surprisingly minor role!
brendel-group.github.io/llm-line/
🧵1/8
reposted by
Joschka Strüber @ICML2025 🇨🇦
Andreas Hochlehnert
7 months ago
CuratedThoughts: Data Curation for RL Datasets 🚀 Since DeepSeek-R1 introduced reasoning-based RL, datasets like Open-R1 & OpenThoughts have emerged for fine-tuning & GRPO. Our deep dive found major flaws: 25% of OpenThoughts had to be eliminated through data curation. Here's why 👇🧵
reposted by
Joschka Strüber @ICML2025 🇨🇦
Wieland Brendel
7 months ago
🚀 We’re hiring! Join Bernhard Schölkopf & me at
@ellisinsttue.bsky.social
to push the frontier of
#AI
in education! We're building cutting-edge, open-source AI tutoring models for high-quality, adaptive learning for all pupils, with support from the Hector Foundation. 👉
forms.gle/sxvXbJhZSccr...