In our new paper, "A Comprehensive Evaluation of Multilingual Chain-of-Thought Reasoning: Performance, Consistency, and Faithfulness Across Languages", we go beyond final-answer accuracy to analyze multilingual reasoning along three dimensions: performance, consistency, and faithfulness.