andrea wang
@andreawwenyi.bsky.social
📤 1867
📥 56
📝 15
phd @ cornell infosci
https://andreawwenyi.github.io
reposted by
andrea wang
Jennah Gosciak
3 months ago
I am presenting a new 📝 “Bias Delayed is Bias Denied? Assessing the Effect of Reporting Delays on Disparity Assessments” at @facct.bsky.social on Thursday, with @aparnabee.bsky.social, Derek Ouyang, @allisonkoe.bsky.social, @marzyehghassemi.bsky.social, and Dan Ho. 🔗: arxiv.org/abs/2506.13735 (1/n)
1
13
7
reposted by
andrea wang
Emma Harvey
3 months ago
I am so excited to be in 🇬🇷Athens🇬🇷 to present "A Framework for Auditing Chatbots for Dialect-Based Quality-of-Service Harms" by me, @kizilcec.bsky.social, and @allisonkoe.bsky.social, at #FAccT2025!! 🔗: arxiv.org/pdf/2506.04419
1
31
12
reposted by
andrea wang
John Garrison Marks
4 months ago
Worth noting today that the entire budget of the NEH is about $200M.
6
420
236
reposted by
andrea wang
David Mimno
4 months ago
New NEH-supported tutorial on running LLMs locally with ollama! Your laptop is more powerful than you think. Save money, privacy, and energy.
aiforhumanists.com/tutorials/
Code Tutorials
The AI for Humanists project is developing resources to enable DH scholars to explore how large language models and AI technologies can be used in their research and teaching. Find an annotated biblio...
https://aiforhumanists.com/tutorials/
3
63
25
reposted by
andrea wang
Lucy Li
5 months ago
I'm joining Wisconsin CS as an assistant professor in fall 2026!! There, I'll continue working on language models, computational social science, & responsible AI. 🌲🧀🚣🏻‍♀️ Apply to be my PhD student! Before then, I'll postdoc for a year in the NLP group at another UW 🏔️ in the Pacific Northwest
16
145
17
reposted by
andrea wang
Alex Gil
5 months ago
For the HTR and OCR crew: New paper by Jonathan Bourne. He's been working to help DLOC handle OCR for a whole bunch of Caribbean historical newspapers. "Scrambled text: fine-tuning language models for OCR error correction using synthetic data"
link.springer.com/article/10.1...
Scrambled text: fine-tuning language models for OCR error correction using synthetic data - International Journal on Document Analysis and Recognition (IJDAR)
OCR errors are common in digitised historical archives significantly affecting their usability and value. Generative Language Models (LMs) have shown potential for correcting these errors using the co...
https://link.springer.com/article/10.1007/s10032-025-00522-0
4
44
17
reposted by
andrea wang
Maria Antoniak
5 months ago
Slightly paraphrasing @oms279.bsky.social during his talk at #COMPTEXT2025: "The single most important use case for LLMs in sociology is turning unstructured data into structured data." Discussing his recent work on codebooks, prompts, and information extraction: osf.io/preprints/so...
2
29
5
reposted by
andrea wang
Simona Liao
10 months ago
Hi everyone, I am excited to share our large-scale survey study with 800+ researchers, which reveals researchers’ usage and perceptions of LLMs as research tools, and how the usage and perceptions differ based on demographics. See results in comments! 🔗 Arxiv link:
arxiv.org/abs/2411.05025
LLMs as Research Tools: A Large Scale Survey of Researchers' Usage and Perceptions
The rise of large language models (LLMs) has led many researchers to consider their usage for scientific work. Some have found benefits using LLMs to augment or automate aspects of their research pipe...
https://arxiv.org/abs/2411.05025
9
98
34
[New preprint!] Do Chinese AI Models Speak Chinese Languages? Not really. Chinese LLMs like DeepSeek are better at French than Cantonese. Joint work with Unso Jo and @dmimno.bsky.social. Link to paper: arxiv.org/pdf/2504.00289 🧵
6 months ago
1
25
6
reposted by
andrea wang
Sung Kim
6 months ago
You’ve probably heard about how AI/LLMs can solve Math Olympiad problems (deepmind.google/discover/blo...). So naturally, some people put it to the test, hours after the 2025 US Math Olympiad problems were released. The result: They all sucked!
9
175
63
reposted by
andrea wang
travis lloyd (træve)
6 months ago
*NEW DATASET AND PAPER* (CHI2025): How are online communities responding to AI-generated content (AIGC)? We study this by collecting and analyzing the public rules of 300,000+ subreddits in 2023 and 2024. 1/
1
16
7
reposted by
andrea wang
Dr. Casey Fiesler
7 months ago
hey it's that time of year again, when people start to wonder whether AIES is actually happening and when this year’s paper deadline might be if so! anyone know anything about the ACM/AAAI conference on AI Ethics & Society for 2025? (I used to ask about this every year on Twitter haha.)
1
21
7
reposted by
andrea wang
David Mimno
11 months ago
Best Student Paper at #AIES 2024 went to @andreawwenyi.bsky.social! Annotating gender-biased narratives in the courtroom is a complex, nuanced task with frequent subjective decision-making by legal experts. We asked: What do experts desire from a language model in this annotation process?
1
19
5
How do LLMs represent relationships between languages? By studying the embedding layers of XLM-R and mT5, we find they are highly interpretable. LLMs can find semantic alignment as an emergent property! Joint work with @dmimno.bsky.social. 🧵
over 1 year ago
2
26
4