📊 The preliminary ranking of the WMT 2025 General Machine Translation benchmark is here!
But don't draw conclusions just yet: automatic metrics are biased in favor of systems that use techniques like MBR decoding or a metric as a reward model. The official human ranking will be part of the General MT findings at WMT.
arxiv.org/abs/2508.14909