UW NLP
@uwnlp.bsky.social
📤 616 · 📥 18 · 📝 2
The NLP group at the University of Washington.
reposted by UW NLP
Stella Li
about 1 year ago
31% of US adults use generative AI for healthcare 🤯 But most AI systems answer questions assertively, even when they don't have the necessary context. Introducing #MediQ, a framework that enables LLMs to recognize uncertainty 🤔 and ask the right questions ❓ when info is missing: 🧵
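A minimal sketch of the ask-or-answer loop the thread describes, assuming placeholder `llm` and `patient` callables (prompt in, text out) rather than MediQ's actual interface:

```python
# Sketch only: decide whether to answer or to ask a clarifying question first.
def answer_or_ask(llm, patient, conversation, max_questions=3):
    for _ in range(max_questions):
        verdict = llm(
            "Is there enough information here to answer the patient safely? "
            "Reply SUFFICIENT or INSUFFICIENT.\n\n" + "\n".join(conversation)
        )
        if verdict.strip().upper().startswith("SUFFICIENT"):
            break
        # Missing context: elicit the single most informative follow-up.
        question = llm(
            "Ask the one most important follow-up question.\n\n"
            + "\n".join(conversation)
        )
        conversation.append("Doctor: " + question)
        conversation.append("Patient: " + patient(question))
    # Answer only once the model judges the context sufficient (or gives up).
    return llm("Answer the patient's question.\n\n" + "\n".join(conversation))
```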
reposted by UW NLP
Melanie Sclar
9 months ago
Excited to be at #ICLR2025 🤩 I'll be giving an oral presentation for Creativity Index on Fri 25th, 11:06, Garnet 212&219 🎙️ I'll also be presenting posters:
📍 ExploreToM, Sat 26th 10:00, Hall 3 + 2B #49
📍 CreativityIndex, Fri 25th 15:00, Hall 3 + 2B #618
Hope to see you there!
reposted by UW NLP
Kabir Ahuja
9 months ago
📢 New Paper! Tired 😴 of reasoning benchmarks full of math & code? In our work we consider the problem of reasoning about plot holes in stories: inconsistencies in a storyline that break the internal logic or rules of a story's world 🌎 w/ @melaniesclar.bsky.social and @tsvetshop.bsky.social 🧵 1/n
reposted by UW NLP
Abhilasha Ravichander
10 months ago
Want to know what training data has been memorized by models like GPT-4? We propose information-guided probes, a method to uncover evidence of memorization in *completely black-box* models, without requiring access to:
🙅‍♀️ Model weights
🙅‍♀️ Training data
🙅‍♀️ Token probabilities
🧵 (1/5)
Information-Guided Identification of Training Data Imprint in (Proprietary) Large Language Models
High-quality training data has proven crucial for developing performant large language models (LLMs). However, commercial LLM providers disclose few, if any, details about the data used for training. ...
https://arxiv.org/abs/2503.12072
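One way to picture a black-box memorization probe in this spirit (a sketch under assumptions: the paper's actual information-guided procedure differs in detail, and `llm` and the `token_info` scorer are hypothetical): hide the highest-information tokens of a candidate document and check whether the model restores them from context alone.

```python
# Sketch: a model that reliably restores rare, high-information tokens of a
# specific document likely saw that exact text during training.
def memorization_probe(llm, document, token_info, k=10):
    tokens = document.split()
    # Rank token positions by information content (e.g., negative log corpus
    # frequency); rare tokens are hardest to guess without memorization.
    targets = sorted(range(len(tokens)), key=lambda i: -token_info(tokens[i]))[:k]
    hits = 0
    for i in targets:
        masked = tokens.copy()
        masked[i] = "____"
        guess = llm("Fill in the blank with the original word:\n" + " ".join(masked))
        hits += guess.strip().strip(".,") == tokens[i]
    return hits / k  # a high restore rate is evidence of memorization
```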
reposted by UW NLP
Hyunwoo Kim
11 months ago
🚨 New Paper! So o3-mini and R1 seem to excel at math & coding. But how good are they in other domains where verifiable rewards are not easily available, such as theory of mind (ToM)? Do they show similar behavioral patterns? 🤔 What if I told you it's... interesting? 🧵
reposted by UW NLP
Abhilasha Ravichander
11 months ago
We are launching HALoGEN 💡, a way to systematically study *when* and *why* LLMs still hallucinate. New work w/ Shrusti Ghela*, David Wadden, and Yejin Choi 💫
📝 Paper: arxiv.org/abs/2501.08292
🚀 Code/Data: github.com/AbhilashaRav...
🌐 Website: halogen-hallucinations.github.io
🧵 [1/n]
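The decompose-and-verify pattern behind this style of hallucination measurement, sketched with hypothetical `extract_atomic_facts` and `verify_fact` helpers standing in for HALoGEN's domain-specific verifiers:

```python
# Sketch: score a generation by the fraction of its atomic facts that a
# trusted knowledge source cannot support.
def hallucination_score(generation, knowledge_source,
                        extract_atomic_facts, verify_fact):
    facts = extract_atomic_facts(generation)   # decompose into atomic claims
    if not facts:
        return 0.0
    unsupported = [f for f in facts if not verify_fact(f, knowledge_source)]
    return len(unsupported) / len(facts)       # 0.0 = fully supported
```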
reposted by UW NLP
Akari Asai
about 1 year ago
I’m on the academic job market this year! I’m completing my @uwcse.bsky.social / @uwnlp.bsky.social Ph.D. (2025), focusing on overcoming LLM limitations such as hallucination by building new LMs. My Ph.D. work focuses on retrieval-augmented LMs to create more reliable AI systems 🧵
reposted by UW NLP
Jiacheng Liu
about 1 year ago
Want to predict the task performance of LMs before pretraining them? We develop task scaling laws and model ladders, which predict the accuracy of OLMo 2 7B & 13B models on individual tasks within 2 points of absolute error. The cost is 1% of the compute used to pretrain them.
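A rough sketch of the two-stage ladder idea with made-up numbers: fit task loss as a power law in training compute on small "ladder" models, map loss to accuracy with a sigmoid, then extrapolate to the target model. The paper's exact functional forms (e.g., fitting in parameters N and tokens D separately) may differ.

```python
import numpy as np
from scipy.optimize import curve_fit

def task_loss(C, A, alpha, E):
    # Power-law decay of task loss with training compute C (roughly 6*N*D).
    return A * C**(-alpha) + E

def loss_to_acc(L, lo, hi, k, L0):
    # Sigmoid mapping from task loss to task accuracy.
    return lo + (hi - lo) / (1.0 + np.exp(k * (L - L0)))

# Hypothetical ladder of small models: compute, measured task loss, accuracy.
C = np.array([4e18, 1.6e19, 6.4e19, 2.6e20, 1.0e21])
L = np.array([2.76, 2.54, 2.35, 2.19, 2.05])
acc = np.array([0.29, 0.37, 0.50, 0.61, 0.68])

p_loss, _ = curve_fit(task_loss, C, L, p0=[100, 0.1, 1.0], maxfev=20000)
p_acc, _ = curve_fit(loss_to_acc, L, acc, p0=[0.25, 0.75, 6.0, 2.3], maxfev=20000)

# Extrapolate to a target model (e.g., ~7B params on ~4T tokens) pre-training.
target_C = 6 * 7e9 * 4e12
print("predicted accuracy:", loss_to_acc(task_loss(target_C, *p_loss), *p_acc))
```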
reposted by UW NLP
Alisa Liu
about 1 year ago
excited to be at #NeurIPS2024! I'll be presenting our data mixture inference attack 🗓️ Thu 4:30pm w/ @jon.jon.ke. Stop by to learn what trained tokenizers reveal about LLM development (‼️) and chat about all things tokenizers. 🔗 arxiv.org/abs/2407.16607
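A toy version of the constraint that powers this attack: each BPE merge the tokenizer learned was, at that moment, the most frequent pair in the unknown training mixture, which yields linear inequalities in the mixture weights. The frequencies below are invented; the real attack stacks constraints from many merges.

```python
import numpy as np
from scipy.optimize import linprog

# freq[c][p] = frequency of candidate pair p in category c's corpus.
# Suppose the tokenizer's first learned merge was pair 0.
freq = np.array([
    [9.0, 2.0, 1.0],   # category "code"
    [3.0, 8.0, 2.0],   # category "web"
])
chosen = 0

# For every other pair p: w . freq[:, chosen] >= w . freq[:, p],
# i.e. (freq[:, p] - freq[:, chosen]) . w <= 0, plus w >= 0, sum(w) = 1.
A_ub = np.stack([freq[:, p] - freq[:, chosen] for p in (1, 2)])
b_ub = np.zeros(2)
res = linprog(c=np.zeros(2), A_ub=A_ub, b_ub=b_ub,
              A_eq=np.ones((1, 2)), b_eq=[1.0], bounds=[(0, 1)] * 2)
print("a feasible mixture:", res.x)
```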
UW NLP
about 1 year ago
See our latest work on (among other things) machine text detection through linguistic creativity measurement!
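A toy sketch of the detection idea: score a text by how much of it can be reconstructed from verbatim n-grams in an existing corpus; machine text tends to be more reconstructable (less "creative"). The real Creativity Index matches against web-scale corpora, so `corpus` here is only a stand-in.

```python
# Sketch: fraction of a text's tokens covered by corpus n-grams.
def ngram_coverage(text, corpus, n=5):
    words = corpus.split()
    corpus_ngrams = {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}
    tokens = text.split()
    covered = [False] * len(tokens)
    for i in range(len(tokens) - n + 1):
        if tuple(tokens[i:i + n]) in corpus_ngrams:
            for j in range(i, i + n):
                covered[j] = True
    # High coverage = low creativity = more likely machine-generated.
    return sum(covered) / max(len(tokens), 1)
```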
reposted by UW NLP
Akari Asai
about 1 year ago
1/ Introducing ᴏᴘᴇɴꜱᴄʜᴏʟᴀʀ: a retrieval-augmented LM to help scientists synthesize knowledge 📚 w/ @uwnlp.bsky.social & Ai2. With open models & a 45M-paper datastore, it outperforms proprietary systems & matches human experts. Try out our demo! openscholar.allen.ai
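The retrieve-then-generate pattern OpenScholar builds on, sketched with placeholder `llm` and `embed` callables and an in-memory paper list instead of the 45M-paper datastore:

```python
import numpy as np

def answer_with_citations(llm, embed, query, papers, k=3):
    q = embed(query)
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))
    # Rank papers by similarity between the query and abstract embeddings.
    ranked = sorted(papers, key=lambda p: -cos(q, embed(p["abstract"])))[:k]
    context = "\n\n".join(
        f"[{i + 1}] {p['title']}: {p['abstract']}" for i, p in enumerate(ranked)
    )
    # Ground the answer in the retrieved papers and ask for citations.
    return llm(
        "Answer the question using only the papers below, citing them as "
        f"[1]..[{k}].\n\n{context}\n\nQuestion: {query}"
    )
```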