Jirui Qi
@jiruiqi.bsky.social
📤 82 · 📥 59 · 📝 31
Ph.D Candidate @GroNLP, University of Groningen
#NLProc
https://betswish.github.io
📌 Pinned post
[1/] 💡 New Paper: Large reasoning models (LRMs) are strong in English, but how well do they reason in your language? Our latest work uncovers their limitations and a clear trade-off: Controlling Thinking Trace Language Comes at the Cost of Accuracy. 📄 Link:
arxiv.org/abs/2505.22888
6 months ago · 1 · 8 · 8
Reposted by Jirui Qi · Arianna Bisazza #EMNLP · 18 days ago
InCLow topics at #EMNLP2025:
- MT error prediction techniques & their reception by professional translators (@gsarti.com)
- thinking language in Large Reasoning Models (@jiruiqi.bsky.social)
- effect of stereotypes on LLMs' implicit personalization (@veraneplenbroek.bsky.social)
...
1 · 5 · 1
Our paper on multilingual reasoning is accepted to Findings of #EMNLP2025! 🎉 (OA: 3/3/3.5/4) We show SOTA LMs struggle with reasoning in non-English languages; prompt hacks & post-training improve alignment but trade off accuracy. 📄
arxiv.org/abs/2505.22888
See you in Suzhou! #EMNLP
3 months ago · 0 · 7 · 3
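As an aside on the "prompt hack" above: one minimal way to steer the thinking-trace language is simply to instruct the model, in the prompt, to reason in the question's language. The model name, the German example, and the exact instruction wording below are illustrative assumptions, not the paper's recipe.

```python
# Minimal sketch: asking a reasoning model to keep its thinking trace in the
# question's language via an explicit instruction. Model name, example
# question, and wording are illustrative assumptions only.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"  # hypothetical choice
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

question = "Ein Zug fährt 120 km in 1,5 Stunden. Wie schnell fährt er?"
messages = [{
    "role": "user",
    # The "hack": append an instruction to think step by step in German.
    "content": f"{question}\n\nBitte denke Schritt für Schritt auf Deutsch nach.",
}]

inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Whether the trace actually stays in German, and what that costs in final-answer accuracy, is exactly the trade-off the paper measures.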
Reposted by Jirui Qi · Gabriele Sarti · 6 months ago
📢 New paper: Can unsupervised metrics extracted from MT models detect their translation errors reliably? Do annotators even *agree* on what constitutes an error? 🧐 We compare uncertainty- and interpretability-based word-level quality estimation (WQE) metrics across 12 translation directions, with some surprising findings! 🧵 1/
1 · 16 · 5
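For context, one of the two metric families mentioned above, uncertainty-based word-level QE, can be sketched by scoring each token of the MT output with the probability the model itself assigns to it and flagging low-confidence tokens as candidate errors. The model, greedy decoding, and the 0.5 threshold are illustrative assumptions, not the paper's setup.

```python
# Minimal sketch of an uncertainty-based word-level QE signal: the probability
# the MT model assigns to each token of its own translation.
# Model choice, greedy decoding, and threshold are illustrative assumptions.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "Helsinki-NLP/opus-mt-en-it"  # hypothetical choice
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
model.eval()

src = "The committee postponed the vote until next week."
enc = tokenizer(src, return_tensors="pt")

with torch.no_grad():
    out = model.generate(
        **enc, num_beams=1, do_sample=False,
        output_scores=True, return_dict_in_generate=True,
    )

# Log-probability of each generated token under the model's own distribution.
token_logprobs = model.compute_transition_scores(
    out.sequences, out.scores, normalize_logits=True
)[0]

tokens = tokenizer.convert_ids_to_tokens(out.sequences[0][1:].tolist())
for tok, logp in zip(tokens, token_logprobs):
    p = logp.exp().item()
    flag = "⚠" if p < 0.5 else " "  # low-confidence tokens as candidate errors
    print(f"{flag} {tok:>12s}  p={p:.2f}")
```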
Reposted by Jirui Qi · Francesca Padovani · 6 months ago
“Child-Directed Language Does Not Consistently Boost Syntax Learning in Language Models” I’m happy to share that the preprint of my first PhD project is now online! 🎊 Paper:
arxiv.org/abs/2505.23689
🔗 Child-Directed Language Does Not Consistently Boost Syntax Learning in Language Models
https://arxiv.org/abs/2505.23689
2 · 62 · 20
✨ New Paper ✨ [1/] Retrieving passages from many languages can boost retrieval-augmented generation (RAG) performance, but how good are LLMs at dealing with multilingual contexts in the prompt? 📄 Check it out:
arxiv.org/abs/2504.00597
(w/ @arianna-bis.bsky.social @Raquel_Fernández) #NLProc
7 months ago · 1 · 4 · 6
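To make the setting concrete, here is a minimal sketch of a RAG prompt assembled from passages retrieved in several languages; the passages, language tags, and layout are illustrative assumptions, not the paper's exact prompt.

```python
# Minimal sketch: building a single RAG prompt from passages retrieved in
# different languages. Passages, tags, and layout are illustrative only.
retrieved = [
    ("en", "The Eiffel Tower was completed in 1889 for the World's Fair."),
    ("it", "La Torre Eiffel è alta circa 330 metri."),
    ("zh", "埃菲尔铁塔位于法国巴黎的战神广场。"),
]
question = "How tall is the Eiffel Tower, and when was it built?"

context = "\n\n".join(
    f"[Passage {i + 1} ({lang})]\n{text}"
    for i, (lang, text) in enumerate(retrieved)
)
prompt = (
    "Answer the question using the passages below. "
    "Note that they may be written in different languages.\n\n"
    f"{context}\n\nQuestion: {question}\nAnswer:"
)
print(prompt)  # feed this to any instruction-tuned LLM
```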
🎉 First post on Bluesky: Our paper on **efficient prompt engineering** has been accepted to the NAACL 2025 Main Conference! 🎉 Key point: LLMs tend to generate better responses when the likelihood of the question segment is higher, i.e. p(question) ∝ performance. Paper available at:
arxiv.org/abs/2411.07773
🔗 Likelihood as a Performance Gauge for Retrieval-Augmented Generation
https://arxiv.org/abs/2411.07773
10 months ago · 1 · 2 · 0
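The claim above, that a higher likelihood of the question segment goes with better answers, can be illustrated with a minimal sketch: condition a causal LM on the retrieved documents and average the log-probability of the question tokens. The model name, documents, and prompt layout are illustrative assumptions, not necessarily the paper's exact setup.

```python
# Minimal sketch: log-likelihood of the question segment given the retrieved
# documents, as a cheap gauge of expected RAG answer quality.
# Model name, documents, and prompt layout are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-1.5B"  # hypothetical choice
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

context = (
    "Doc 1: The Amazon is the largest rainforest on Earth.\n"
    "Doc 2: The Sahara is the largest hot desert on Earth.\n"
    "Question: "
)
question = "Which rainforest is the largest on Earth?"

# Tokenizing the two segments separately is a simplification of how the full
# prompt would normally be tokenized in one pass.
ctx_ids = tokenizer(context, return_tensors="pt").input_ids
q_ids = tokenizer(question, add_special_tokens=False, return_tensors="pt").input_ids
input_ids = torch.cat([ctx_ids, q_ids], dim=1)

with torch.no_grad():
    logits = model(input_ids).logits

# Logits at position i predict token i+1, so the predictions for the question
# tokens sit at positions [len(ctx) - 1, len(ctx) + len(q) - 2].
q_logits = logits[0, ctx_ids.shape[1] - 1 : -1]
log_probs = torch.log_softmax(q_logits, dim=-1)
token_lp = log_probs.gather(1, q_ids[0].unsqueeze(1)).squeeze(1)
print("mean log p(question | retrieved docs):", token_lp.mean().item())
```

Under the paper's claim, a higher value of this score signals that the given context is more favorable for answering the question.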