Andrew Lee
@ajyl.bsky.social
📤 721
📥 595
📝 41
Post-doc @ Harvard. PhD UMich. Spent time at FAIR and MSR. ML/NLP/Interpretability
Question
@neuripsconf.bsky.social
- a coauthor had his reviews reassigned many weeks ago. The ACs of those papers told him: "I've been told to tell you: leave a short note. You won't be penalized." Now I'm being warned of a desk reject due to his short/poor reviews. What's the right protocol here?
3 months ago
0
0
0
reposted by
Andrew Lee
4 months ago
How do language models track mental states of each character in a story, often referred to as Theory of Mind? We reverse-engineered how LLaMA-3-70B-Instruct handles a belief-tracking task and found something surprising: it uses mechanisms strikingly similar to pointer variables in C programming!
2
58
20
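To make the pointer analogy from the post above concrete, here is a minimal C sketch. It is purely illustrative: the Sally-Anne-style false-belief setup and all variable names are assumptions for exposition, not the paper's actual circuit. The idea is that a character's belief is held as an address-like reference to a location, so a belief goes stale when only the observing character's pointer is redirected.

```c
#include <stdio.h>

int main(void) {
    const char *box = "box";
    const char *basket = "basket";

    /* Each character's belief about the ball is a pointer to a location,
       not a copy of the location's content. */
    const char **sally_belief = &box;
    const char **anne_belief  = &box;

    /* Sally leaves the room; Anne moves the ball. Only Anne's pointer is
       redirected; Sally's still refers to the stale location. */
    anne_belief = &basket;

    printf("Sally thinks the ball is in the %s\n", *sally_belief); /* box */
    printf("Anne thinks the ball is in the %s\n",  *anne_belief);  /* basket */
    return 0;
}
```

The claim in the post is that LLaMA-3-70B-Instruct tracks beliefs with a similarly indirect, reference-like binding rather than by copying belief content around.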
reposted by
Andrew Lee
Lihao Sun
4 months ago
🚨New
#ACL2025
paper! Today's "safe" language models can look unbiased, but alignment can actually make them implicitly more biased by reducing their sensitivity to race-related associations. 🧵 Find out more below!
1
12
3
🚨New preprint! How do reasoning models verify their own chain of thought (CoT)? We reverse-engineer LMs and find critical components and subspaces needed for self-verification! 1/n
5 months ago
1
16
3
🚨New Preprint! Did you know that steering vectors from one LM can be transferred and re-used in another LM? We argue this is because token embeddings across LMs share many “global” and “local” geometric similarities!
5 months ago
3
63
16
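A minimal C sketch of what "re-using" a steering vector means mechanically: adding a scaled steering direction to a hidden state. The dimension, values, and scale below are toy assumptions for illustration, not anything from the preprint.

```c
#include <stdio.h>

#define DIM 4  /* toy hidden dimension for illustration */

/* Steer a hidden state by adding a scaled steering direction.
   Reusing a vector taken from a different LM is the same addition;
   the post's claim is that shared embedding geometry across LMs is
   what makes the borrowed direction meaningful in the new model. */
static void apply_steering(float hidden[DIM], const float steer[DIM],
                           float alpha) {
    for (int i = 0; i < DIM; i++)
        hidden[i] += alpha * steer[i];
}

int main(void) {
    float hidden[DIM] = {0.2f, -0.1f, 0.5f, 0.3f};  /* toy activation */
    float steer[DIM]  = {0.1f,  0.4f, -0.2f, 0.0f}; /* toy steering vector,
                                                       imagined as extracted
                                                       from another LM */
    apply_steering(hidden, steer, 2.0f);
    for (int i = 0; i < DIM; i++)
        printf("%.2f ", hidden[i]);
    printf("\n");
    return 0;
}
```

In practice, transferring across models would also require the two activation spaces to be aligned (e.g., up to a rotation), which is where the claimed "global" and "local" geometric similarity of token embeddings does the work.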
reposted by
Andrew Lee
David Bau
8 months ago
Today we launch a new open research community. It is called ARBOR:
arborproject.github.io/
Please join us.
bsky.app/profile/ajy...
1
15
7
Excited about recent reasoning models? What is happening under the hood? Join ARBOR: Analysis of Reasoning Behaviors through *Open Research* - a radically open collaboration to reverse-engineer reasoning models! Learn more:
arborproject.github.io
1/N
8 months ago
1
13
3
New paper <3 Interested in inference-time scaling? In-context learning? Mech interp? LMs can solve novel in-context tasks given sufficient examples (longer contexts). Why? Because they dynamically form *in-context representations*! 1/N
9 months ago
3
52
17