RL boosts LLM reasoning—but why stop at math & code? 🤔
Meet Nemotron-CrossThink—a method to scale RL-based self-learning across law, physics, social science & more.
🔥 The result: a model that reasons broadly, adapts dynamically, & uses 28% fewer tokens for correct answers!
🧵↓