Lihao Sun
@1e0sun.bsky.social
Working on LLM interpretability; recent graduate of UChicago. slhleosun.github.io
🚨New #ACL2025 paper! Today's "safe" language models can look unbiased, but alignment can actually make them more biased implicitly by reducing their sensitivity to race-related associations. 🧵Find out more below!
7 months ago