(@frascuchon.bsky.social)

who's fine-tuning LLMs for reasoning? This dataset has been trending for a few weeks and there's a list of models trained on it. - It has SFT formatted reasoning sequences, like those in o1. - You could incorporate these into post training to boost reasoning abilities.

loading . . .

O1-OPEN/OpenO1-SFT · Datasets at Hugging Face We’re on a journey to advance and democratize artificial intelligence through open source and open science. https://buff.ly/3OP6lgu