The secret sauce for this work is the ReAct style training data preparation: “User-Thought1-Action/API-Observation-Thought2-Response”. We transformed public dialogue datasets into this format for training. Congratulations to @emrecanacikgoz and the @convai_uiuc and Oumi teams!
add a skeleton here at some point
about 1 year ago