UW NLP
@uwnlp.bsky.social
The NLP group at the University of Washington.
Reposted by UW NLP · Stella Li · 10 months ago
31% of US adults use generative AI for healthcare 🤯 But most AI systems answer questions assertively, even when they don't have the necessary context. Introducing #MediQ, a framework that enables LLMs to recognize uncertainty 🤔 and ask the right questions ❓ when info is missing: 🧵
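The core idea (answer only when the needed context is present, otherwise ask a targeted follow-up) can be sketched as a toy loop. Everything below is an illustrative stand-in, not the MediQ implementation: the required-field schema and both functions are hypothetical.

```python
# Hypothetical "abstain and ask" sketch in the spirit of MediQ: check whether
# enough patient context is available before answering, and ask a targeted
# follow-up question if not. The schema and functions are illustrative only.

REQUIRED_FIELDS = {"symptom", "duration", "age"}  # assumed toy schema

def missing_info(context: dict) -> set:
    """Return which required fields the patient hasn't provided yet."""
    return REQUIRED_FIELDS - set(context)

def respond(context: dict) -> str:
    """Ask a targeted question while info is missing; answer only when ready."""
    gaps = missing_info(context)
    if gaps:
        field = sorted(gaps)[0]  # pick one missing field to ask about
        return f"Before I answer, could you tell me your {field}?"
    return "Based on the details provided, here is my assessment..."

print(respond({"symptom": "headache"}))  # asks for more context first
print(respond({"symptom": "headache", "duration": "2 days", "age": 34}))
```

In a real system the "is context sufficient?" check would itself be an LLM judgment rather than a fixed field list; the fixed schema just keeps the control flow visible.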
Reposted by UW NLP · Melanie Sclar · 5 months ago
Excited to be at #ICLR2025 🤩 I'll be giving an oral presentation for Creativity Index on Fri 25th 11:06, Garnet 212&219 🎙️ I'll also be presenting posters:
📍 ExploreToM, Sat 26th 10:00, Hall 3 + 2B #49
📍 CreativityIndex, Fri 25th 15:00, Hall 3 + 2B #618
Hope to see you there!
Reposted by UW NLP · Kabir Ahuja · 5 months ago
📢 New Paper! Tired 😴 of reasoning benchmarks full of math & code? In our work we consider the problem of reasoning about plot holes in stories -- inconsistencies in a storyline that break the internal logic or rules of a story's world 🌎 W/ @melaniesclar.bsky.social and @tsvetshop.bsky.social 🧵 1/n
Reposted by UW NLP · Abhilasha Ravichander · 6 months ago
Want to know what training data has been memorized by models like GPT-4? We propose information-guided probes, a method to uncover memorization evidence in *completely black-box* models, without requiring access to:
🙅‍♀️ Model weights
🙅‍♀️ Training data
🙅‍♀️ Token probabilities
🧵 (1/5)
Information-Guided Identification of Training Data Imprint in (Proprietary) Large Language Models
High-quality training data has proven crucial for developing performant large language models (LLMs). However, commercial LLM providers disclose few, if any, details about the data used for training. ...
https://arxiv.org/abs/2503.12072
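For intuition, a much simpler *generic* black-box memorization check (a classic prefix probe, not the paper's information-guided method) can be sketched as: feed the model a verbatim prefix from a candidate document and see whether its completion reproduces the true continuation. The `query_model` function below is a hypothetical stand-in for any completion API that returns only generated text.

```python
# Hypothetical sketch of a generic black-box memorization check (prefix
# probing). This is NOT the paper's information-guided method; `query_model`
# is an illustrative stand-in for a text-completion API.

def query_model(prefix: str) -> str:
    # Stand-in: a real system would call a completion API here.
    # This toy "model" has memorized one sentence verbatim.
    memorized = "It was the best of times, it was the worst of times."
    if memorized.startswith(prefix):
        return memorized[len(prefix):]
    return " something generic."

def looks_memorized(prefix: str, true_continuation: str) -> bool:
    """Flag memorization if the black-box completion matches the source."""
    completion = query_model(prefix)
    return completion.strip().startswith(true_continuation.strip())

print(looks_memorized("It was the best of times,", "it was the worst"))
```

The paper's contribution is precisely that naive probes like this are weak evidence; this sketch only shows the black-box access pattern (text in, text out, no weights or probabilities).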
Reposted by UW NLP · Hyunwoo Kim · 7 months ago
🚨New Paper! So o3-mini and R1 seem to excel on math & coding. But how good are they on other domains where verifiable rewards are not easily available, such as theory of mind (ToM)? Do they show similar behavioral patterns? 🤔 What if I told you it's...interesting, like the below?🧵
Reposted by UW NLP · Abhilasha Ravichander · 8 months ago
We are launching HALoGEN 💡, a way to systematically study *when* and *why* LLMs still hallucinate. New work w/ Shrusti Ghela*, David Wadden, and Yejin Choi 💫
📝 Paper: arxiv.org/abs/2501.08292
🚀 Code/Data: github.com/AbhilashaRav...
🌐 Website: halogen-hallucinations.github.io
🧵 [1/n]
Reposted by UW NLP · Akari Asai · 10 months ago
I’m on the academic job market this year! I’m completing my @uwcse.bsky.social @uwnlp.bsky.social Ph.D. (2025), focusing on overcoming LLM limitations like hallucinations by building new LMs. My Ph.D. work focuses on Retrieval-Augmented LMs to create more reliable AI systems 🧵
Reposted by UW NLP · Jiacheng Liu · 10 months ago
Want to predict the task performance of LMs before pretraining them? We develop task scaling laws and model ladders, which predict the accuracy on individual tasks by OLMo 2 7B & 13B models within 2 points of absolute error. The cost is 1% of the compute used to pretrain them.
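The "model ladder" idea (measure task accuracy on a ladder of small models, fit a scaling trend, and extrapolate to the target model's compute) can be sketched in a few lines. All numbers below are made up for illustration, and the simple log-linear fit is an assumption, not the paper's actual functional form.

```python
import numpy as np

# Hypothetical "model ladder": task accuracies measured on small models
# trained at increasing pretraining compute (FLOPs). The values are
# illustrative, not from the paper.
compute = np.array([1e19, 3e19, 1e20, 3e20])   # small-model compute budgets
accuracy = np.array([0.32, 0.41, 0.52, 0.61])  # observed task accuracy

# Fit a simple log-linear trend: acc ≈ a * log10(C) + b
# (a stand-in for the paper's scaling-law functional form)
a, b = np.polyfit(np.log10(compute), accuracy, deg=1)

# Extrapolate to the target model's (much larger) compute budget
target_compute = 1e21
predicted = a * np.log10(target_compute) + b
print(f"predicted accuracy: {predicted:.2f}")
```

The cheap part is that only the small ladder models are ever trained; the target model's accuracy is read off the fitted curve, which is how a 1%-of-pretraining-compute prediction budget is plausible.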
Reposted by UW NLP · Alisa Liu · 10 months ago
excited to be at #NeurIPS2024! I'll be presenting our data mixture inference attack 🗓️ Thu 4:30pm w/ @jon.jon.ke. Stop by to learn what trained tokenizers reveal about LLM development (‼️) and chat about all things tokenizers. 🔗 arxiv.org/abs/2407.16607
UW NLP · 10 months ago
See our latest work on (among other things) machine text detection through linguistic creativity measurement!
Reposted by UW NLP · Akari Asai · 11 months ago
1/ Introducing ᴏᴘᴇɴꜱᴄʜᴏʟᴀʀ: a retrieval-augmented LM to help scientists synthesize knowledge 📚 @uwnlp.bsky.social & Ai2. With open models & 45M-paper datastores, it outperforms proprietary systems & matches human experts. Try out our demo! openscholar.allen.ai