Tim Baumgärtner
@timbmg.bsky.social
👤 91
👥 269
10
👨‍💻 NLP PhD Student
@ukplab.bsky.social
pinned post!
"An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship." Claerbout, 1992 Decades later, code is routinely released with papers. But does the code match the ad?
about 2 months ago
1
0
0
reposted by
Tim Baumgärtner
ACL Rolling Review (ARR)
about 1 month ago
📢 ARR-May reviewers can now try REVAS, an experimental review support tool. REVAS gives feedback on review quality criteria and ARR reviewer heuristics, but does not suggest review content or scores.
aclrollingreview.org/revas-may26
#ARR
#EMNLP
#ACL
#NLProc
Using the REVAS review assistant tool in the ARR-May cycle
In the March 2026 ARR cycle, we for the first time encouraged reviewers to try the experimental review support tool called REVAS. The instructions were communicated to the cycle reviewers, but we real...
https://aclrollingreview.org/revas-may26
0
5
2
REVAS is a peer review assistant I've been working on with colleagues at MBZUAI. The idea is simple: better reviews make better science. If you're reviewing for ARR this cycle, give it a try.
2 months ago
0
0
0
💡 TIL
@overleaf.com
is basically a git repo. In my research workflow, I added it directly as a submodule to my code repo. Now I can produce figures and tables and have them magically uploaded to Overleaf just by pushing the repo. No more renaming, keeping versions straight, or manual uploading.
8 months ago
0
0
0
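A minimal sketch of the Overleaf workflow from the post above, not the author's actual setup. It assumes the Overleaf project is reachable over Overleaf's Git integration and was added once as a submodule in a directory called `paper` (a hypothetical name), and that scripts write figures and tables into that directory. The helper names `publish_commands` and `publish` are made up for illustration.

```python
import subprocess
from pathlib import Path

# One-time setup, run by hand in the code repo (assumed URL pattern, project ID elided):
#   git submodule add https://git.overleaf.com/<project-id> paper


def publish_commands(submodule: Path, message: str) -> list[list[str]]:
    """Return the git commands that commit and push everything inside the submodule."""
    git = ["git", "-C", str(submodule)]  # -C runs git inside the submodule directory
    return [
        git + ["add", "--all"],
        git + ["commit", "-m", message],
        git + ["push"],
    ]


def publish(submodule: Path, message: str) -> None:
    """Run the commands in order and stop at the first failure.

    Assumes the submodule has a branch checked out (not a detached HEAD).
    Note that `git commit` exits non-zero when there is nothing to commit,
    which raises CalledProcessError here.
    """
    for cmd in publish_commands(submodule, message):
        subprocess.run(cmd, check=True)
```

After figures are regenerated, calling `publish(Path("paper"), "Update figures")` would send them to Overleaf; the parent repo still needs its own commit to record the new submodule revision.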
💡 TIL: it's super easy to fetch data from Google Sheets into Pandas, which makes annotating data really convenient. Previously, I was always downloading CSVs, losing track of file versions, and clumsily loading and merging them in Python. Find the code here:
gist.github.com/timbmg/6c2d6...
8 months ago
0
0
0
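The linked gist is truncated here, so the following is only a sketch of one common way to do this, not necessarily the author's code. It assumes the sheet is shared so that anyone with the link can view it, and it relies on the CSV export endpoint of Google Sheets. The helper names and the `"YOUR_SHEET_ID"` placeholder are hypothetical.

```python
from urllib.parse import urlencode

import pandas as pd


def sheet_csv_url(sheet_id: str, gid: int = 0) -> str:
    """Build the CSV export URL for one tab (gid) of a Google Sheet."""
    query = urlencode({"format": "csv", "gid": gid})
    return f"https://docs.google.com/spreadsheets/d/{sheet_id}/export?{query}"


def load_sheet(sheet_id: str, gid: int = 0) -> pd.DataFrame:
    """Read one tab into a DataFrame.

    Needs network access and a sheet that is viewable via its link;
    pandas fetches the URL and parses the response as CSV.
    """
    return pd.read_csv(sheet_csv_url(sheet_id, gid))


# Usage (placeholder ID, replace with the ID from the sheet's URL):
#   annotations = load_sheet("YOUR_SHEET_ID")
#   print(annotations.head())
```

Because every call reads the live sheet, annotations edited in the browser show up on the next load without any file downloads.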
reposted by
Tim Baumgärtner
UKP Lab
about 1 year ago
Want to evaluate models on scientific QA, but your dataset lacks real-world questions asked by experts? PeerQA is the solution: a dataset with questions from peer reviews and answers from the original authors. (1/🧵)
#NLProc
1
2
1
reposted by
Tim Baumgärtner
Martin Tutek
over 1 year ago
🚨🚨 New preprint 🚨🚨 Ever wonder whether verbalized CoTs correspond to the internal reasoning process of the model? We propose a novel parametric faithfulness approach, which erases information contained in CoT steps from the model parameters to assess CoT faithfulness.
arxiv.org/abs/2502.14829
Measuring Faithfulness of Chains of Thought by Unlearning Reasoning Steps
When prompted to think step-by-step, language models (LMs) produce a chain of thought (CoT), a sequence of reasoning steps that the model supposedly used to produce its prediction. However, despite mu...
https://arxiv.org/abs/2502.14829
2
48
16
reposted by
Tim Baumgärtner
UKP Lab
over 1 year ago
Fact-Checking in the Age of AI: A Talk by Iryna Gurevych @AI for Good. Misinformation is a new weapon disrupting public debates, scientific discussions, and political decisions. How can we identify and counter misleading content? (1/🧵)
Towards real-world fact-checking with large language models
Misinformation poses a growing threat to our society. It has a severe impact on public health by promoting fake cures fear and distrust. Current research
https://aiforgood.itu.int/event/towards-real-world-fact-checking-with-large-language-models/
1
3
1
reposted by
Tim Baumgärtner
Florent Daudens
over 1 year ago
An Energy Star for AI? Introducing AI Energy Score: First-ever rating system comparing 166 AI models' energy consumption! From LLaMa to Gemma, get transparent ⭐️1-5 efficiency ratings. Incredible work led by
@sashamtl.bsky.social
huggingface.co/blog/sasha/a...
0
25
8
Excited to share that our paper "PeerQA: A Scientific Question Answering Dataset from Peer Reviews" has been accepted to
#NAACL2025
Looking forward to presenting it in Albuquerque!
over 1 year ago
0
5
0
reposted by
Tim Baumgärtner
Bertram Højer
over 1 year ago
What do YOU mean by "intelligence", and does ChatGPT fit your definition? We collected the major criteria used in CogSci and other fields, and designed a survey to find out! Access link:
www.survey-xact.dk/collect
Code: 4S7V-SN4M-S536 Time: 5-10 mins
Perspectives on Intelligence: Community Survey
Research survey exploring how NLP/ML/CogSci researchers define and use the concept of intelligence.
https://bertramhojer.github.io/projects/intelligence-survey/
2
32
23