Julia Kreutzer (@juliakreutzer.bsky.social)

🍋 Squeezing the most out of few samples - check out our LLMonade recipe for few-sample test-time scaling in multitask environments. Turns out that standard methods miss out on gains in non-English languages. We propose more robust alternatives. Very proud of this work, which our scholar Ammar led! 🚀

about 1 year ago