Willem Röpke (@willemropke.bsky.social)

I think this is the best paper I’ve ever read: arxiv.org/abs/2404.03715 A strong emphasis on theoretically principled algorithms for RLHF followed by motivated practical implementations. Well-written and a clear overview of the relevant background and related work. 10/10 no comments

loading . . .

Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences This paper studies post-training large language models (LLMs) using preference feedback from a powerful oracle to help a model iteratively improve over itself. The typical approach for post-training L... https://arxiv.org/abs/2404.03715

about 1 year ago