Jirui Qi
@jiruiqi.bsky.social
👤 85
👥 60
32
Ph.D Candidate @GroNLP, University of Groningen
#NLProc
https://betswish.github.io
pinned post!
[1/] 💡 New Paper: Large reasoning models (LRMs) are strong in English, but how well do they reason in your language? Our latest work uncovers their limitation and a clear trade-off: Controlling Thinking Trace Language Comes at the Cost of Accuracy. Link:
arxiv.org/abs/2505.22888
about 1 year ago
1
8
8
Excited to kick off a 3-month research visit at Rycolab (ETH Zurich)! 🇨🇭 My research focuses on RL, alignment, multilingual LMs, reasoning, and RAG. If you're exploring any of these areas, feel free to reach out or say hi!
#NLP
#RL
#AIAlignment
#Multilinguality
3 months ago
0
6
0
reposted by
Jirui Qi
Arianna Bisazza
7 months ago
InCLow topics #EMNLP2025:
- MT error prediction techniques and their reception by professional translators (@gsarti.com)
- thinking language in Large Reasoning Models (@jiruiqi.bsky.social)
- effect of stereotypes on LLMs' implicit personalization (@veraneplenbroek.bsky.social)
....
1
5
1
Our paper on multilingual reasoning has been accepted to Findings of #EMNLP2025! (OA: 3/3/3.5/4) We show that SOTA LMs struggle with reasoning in non-English languages; prompt hacking and post-training improve alignment but trade off accuracy.
arxiv.org/abs/2505.22888
See you in Suzhou!
#EMNLP
10 months ago
0
7
3
reposted by
Jirui Qi
Gabriele Sarti
about 1 year ago
📢 New paper: Can unsupervised metrics extracted from MT models detect their translation errors reliably? Do annotators even *agree* on what constitutes an error? We compare uncertainty- and interp-based WQE metrics across 12 directions, with some surprising findings! 🧵 1/
1
16
5
reposted by
Jirui Qi
Francesca Padovani
about 1 year ago
“Child-Directed Language Does Not Consistently Boost Syntax Learning in Language Models” I'm happy to share that the preprint of my first PhD project is now online! Paper:
arxiv.org/abs/2505.23689
Child-Directed Language Does Not Consistently Boost Syntax Learning in Language Models
Seminal work by Huebner et al. (2021) showed that language models (LMs) trained on English Child-Directed Language (CDL) can reach similar syntactic abilities as LMs trained on much larger amounts of ...
https://arxiv.org/abs/2505.23689
2
61
20
✨ New Paper ✨ [1/] Retrieving passages from many languages can boost retrieval-augmented generation (RAG) performance, but how good are LLMs at dealing with multilingual contexts in the prompt? Check it out:
arxiv.org/abs/2504.00597
(w/ @arianna-bis.bsky.social and @Raquel_Fernández)
#NLProc
about 1 year ago
1
4
6
First post on Blue: Our paper on **efficient prompt engineering** has been accepted to the NAACL 2025 Main Conference! Key point: LLMs tend to generate better responses when the likelihood of the question segment is higher, i.e. p(question) ∝ Performance. Paper available at:
arxiv.org/abs/2411.07773
Likelihood as a Performance Gauge for Retrieval-Augmented Generation
Recent work finds that retrieval-augmented generation with large language models is prone to be influenced by the order of retrieved documents in the context. However, the lack of in-depth analysis li...
https://arxiv.org/abs/2411.07773
over 1 year ago
1
2
0