Taylor Sorensen
@taylor-sorensen.bsky.social
NLP PhD Candidate at UW
reposted by
Taylor Sorensen
Abhilasha Ravichander
6 months ago
Want to know what training data has been memorized by models like GPT-4? We propose information-guided probes, a method to uncover memorization evidence in *completely black-box* models, without requiring access to model weights, training data, or token probabilities. 🧵 (1/5)
Information-Guided Identification of Training Data Imprint in (Proprietary) Large Language Models
High-quality training data has proven crucial for developing performant large language models (LLMs). However, commercial LLM providers disclose few, if any, details about the data used for training. ...
https://arxiv.org/abs/2503.12072
Most AI systems assume there's just one right answer, but many tasks have reasonable disagreement. How can we better model human variation? We propose modeling at the individual level using open-ended, textual value profiles!
arxiv.org/abs/2503.15484
7 months ago
I'll be in Vancouver this weekend for the NeurIPS workshops (go pluralistic alignment!) DMs are open if anyone wants to chat! :)
10 months ago
reposted by
Taylor Sorensen
10 months ago
Come to our Pluralistic Alignment Workshop at #NeurIPS2024! December 14, West Meeting Room 116, 117. Join us to explore pluralistic perspectives in alignment with an incredible lineup of talks and speakers! Full schedule & details:
pluralistic-alignment.github.io
reposted by
Taylor Sorensen
Amy Zhang
10 months ago
If you are headed to NeurIPS, please join for our Pluralistic Alignment workshop! We have a great set of speakers from a range of backgrounds. & all the papers that will be presented at the workshop are posted:
pluralistic-alignment.github.io
lmk if you'd like to catch up at the conference too! :)
Pluralistic Alignment @ NeurIPS 2024
Pluralistic Alignment
https://pluralistic-alignment.github.io/