Weijie Su
@wjsu.bsky.social
Associate Professor at University of Pennsylvania
We're excited to announce the call for papers for #ICML 2026: icml.cc/Conferences/...
See you in Seoul next summer!
25 days ago
Great minds think alike! Alan Turing cracked Enigma in WWII; Brad Efron asked how many words Shakespeare knew. Both relied on the same method: the Good–Turing estimator of unseen frequencies. We use this method for LLM evaluation, to estimate certain unseen capabilities of LLMs:
arxiv.org/abs/2506.02058
Evaluating the Unseen Capabilities: How Many Theorems Do LLMs Know?
Accurate evaluation of large language models (LLMs) is crucial for understanding their capabilities and guiding their development. However, current evaluations often inconsistently reflect the actual ...
https://arxiv.org/abs/2506.02058
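The shared method's core idea: the fraction of observations seen exactly once estimates the probability mass of everything not yet seen. A minimal sketch of the classical Good–Turing estimator (the toy data below are invented; the paper's actual procedure is more refined):

```python
from collections import Counter

def good_turing_unseen_mass(observations):
    # Good-Turing estimate of the probability mass of unseen items:
    # the fraction of observations whose item appears exactly once (N1 / N).
    counts = Counter(observations)
    n1 = sum(1 for c in counts.values() if c == 1)
    return n1 / len(observations)

# Toy data: pretend each observation is a theorem an LLM cites when probed.
sample = ["pythagoras", "bayes", "bayes", "fermat", "noether",
          "bayes", "pythagoras", "green", "stokes"]
print(f"estimated unseen mass: {good_turing_unseen_mass(sample):.2f}")  # 0.44
```

Singletons are the signal: items you barely saw tell you how much you haven't seen at all.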
6 months ago
A (not so) new paper on #LLM alignment from a social choice theory viewpoint:
arxiv.org/abs/2503.10990
It establishes fundamental impossibility results about representing (diverse) human preferences.
Statistical Impossibility and Possibility of Aligning LLMs with Human Preferences: From Condorcet Paradox to Nash Equilibrium
Aligning large language models (LLMs) with diverse human preferences is critical for ensuring fairness and informed outcomes when deploying these models for decision-making. In this paper, we seek to ...
https://arxiv.org/abs/2503.10990
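The core obstruction is the classical Condorcet paradox: cyclic majority preferences admit no consistent aggregate ranking. A toy check with three hypothetical annotators (the rankings below are invented to form a cycle; the paper's impossibility results are far more general):

```python
from itertools import combinations

# Three hypothetical annotators, each ranking responses best-to-worst.
rankings = [["A", "B", "C"],
            ["B", "C", "A"],
            ["C", "A", "B"]]

def majority_prefers(x, y):
    # True if a strict majority of annotators rank x above y.
    wins = sum(r.index(x) < r.index(y) for r in rankings)
    return wins > len(rankings) / 2

for x, y in combinations("ABC", 2):
    winner, loser = (x, y) if majority_prefers(x, y) else (y, x)
    print(f"majority prefers {winner} over {loser}")
# Prints A over B, B over C, and C over A: a cycle, so no single
# ranking can represent these majority preferences.
```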
6 months ago
We posted a paper on optimization for deep learning:
arxiv.org/abs/2505.21799
There has recently been a surge of interest in *structure-aware* optimizers: Muon, Shampoo, SOAP. In this paper, we propose a unifying preconditioning perspective that offers insights into these matrix-gradient methods.
PolarGrad: A Class of Matrix-Gradient Optimizers from a Unifying Preconditioning Perspective
The ever-growing scale of deep learning models and datasets underscores the critical importance of efficient optimization methods. While preconditioned gradient methods such as Adam and AdamW are the ...
https://arxiv.org/abs/2505.21799
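Roughly, where Adam preconditions each gradient entry separately, these methods treat the gradient as a matrix and normalize its singular values. A minimal sketch of a polar-factor update via SVD (my simplified illustration, not the paper's algorithm; PolarGrad and Muon use cheaper iterations such as Newton-Schulz, and the hyperparameters here are placeholders):

```python
import numpy as np

def polar_factor(g):
    # Orthogonal polar factor of the gradient matrix: U @ Vt from the SVD
    # G = U diag(s) Vt. All singular values of the result equal 1.
    u, _, vt = np.linalg.svd(g, full_matrices=False)
    return u @ vt

def polar_step(w, grad, lr=0.01):  # lr value is a placeholder
    # Descend along the polar factor instead of the raw gradient.
    return w - lr * polar_factor(grad)

# Toy usage on a random weight matrix.
rng = np.random.default_rng(0)
w, grad = rng.normal(size=(4, 3)), rng.normal(size=(4, 3))
w = polar_step(w, grad)
print(np.linalg.svd(polar_factor(grad), compute_uv=False))  # ~[1. 1. 1.]
```

On this view, loosely, the different optimizers correspond to different preconditioners pushing the raw gradient toward its polar factor.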
6 months ago
I just wrote a position paper on the relation between statistics and large language models: Do Large Language Models (Really) Need Statistical Foundations?
arxiv.org/abs/2505.19145
Any comments are welcome. Thx!
Do Large Language Models (Really) Need Statistical Foundations?
Large language models (LLMs) represent a new paradigm for processing unstructured data, with applications across an unprecedented range of domains. In this paper, we address, through two arguments, wh...
https://arxiv.org/abs/2505.19145
6 months ago
Our paper "The ICML 2023 Ranking Experiment: Examining Author Self-Assessment in ML/AI Peer Review" will appear in JASA as a Discussion Paper:
arxiv.org/abs/2408.13430
It's a privilege to work with such a wonderful team: Buxin, Jiayao, Natalie, Yuling, Didong, Kyunghyun, Jianqing, and Aaron.
The ICML 2023 Ranking Experiment: Examining Author Self-Assessment in ML/AI Peer Review
We conducted an experiment during the review process of the 2023 International Conference on Machine Learning (ICML), asking authors with multiple submissions to rank their papers based on perceived q...
https://arxiv.org/abs/2408.13430
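One illustration of what the collected rankings make possible (all numbers below are invented, not from the paper): comparing an author's self-ranking against review outcomes with a rank correlation such as Kendall's tau.

```python
from scipy.stats import kendalltau

# Hypothetical author with four submissions.
# self_rank: author's perceived quality (1 = best); review_score: mean reviewer score.
self_rank = [1, 2, 3, 4]
review_score = [6.5, 5.0, 5.5, 3.0]

# Negate the scores so both sequences order papers best-to-worst the same way.
tau, _ = kendalltau(self_rank, [-s for s in review_score])
print(f"Kendall's tau = {tau:.2f}")  # 0.67; 1.0 would be perfect agreement
```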
6 months ago
We're hiring a postdoc focused on the statistical foundations of large language models, starting this fall. Join our team exploring the theoretical and statistical underpinnings of LLMs. If interested, check our work:
weijie-su.com/llm/
and drop me an email.
#AIResearch
#PostdocPosition
Statistical Foundations of Large Language Models
http://weijie-su.com/llm/
7 months ago
reposted by
Weijie Su
Gautam Kamath
7 months ago
I wrote a post on how to connect with people (i.e., make friends) at CS conferences. These events can be intimidating, so here are some suggestions on how to navigate them. I'm late for #ICLR2025 and #NAACL2025, but in time for #AISTATS2025 and #ICML2025! 1/3
kamathematics.wordpress.com/2025/05/01/t...
Tips on How to Connect at Academic Conferences
I was a kinda awkward teenager. If you are a CS researcher reading this post, then chances are, you were too. How to navigate social situations and make friends is not always intuitive, and has to …
https://kamathematics.wordpress.com/2025/05/01/tips-on-how-to-connect-at-academic-conferences/
The #ICML2025 @icmlconf.bsky.social deadline has just passed! Peer review is vital to advancing AI research. We've been conducting a survey experiment at ICML since 2023. Pls take a few minutes to participate; the survey was sent via email with the subject "[ICML 2025] Author Survey". Thx!
10 months ago
A special issue on large language models (LLMs) and statistics at Stat (onlinelibrary.wiley.com/journal/2049...). We're seeking submissions examining LLMs' impact on statistical methods, practice, education, and more. @amstatnews.bsky.social
Stat
https://onlinelibrary.wiley.com/journal/20491573
12 months ago
A departmental postdoc position opening in my dept:
statistics.wharton.upenn.edu/recruiting/d...
Departmental Postdoctoral Researcher Position
https://statistics.wharton.upenn.edu/recruiting/dept-postdoc-position/
12 months ago
Heading to Vancouver tomorrow for #NeurIPS2024, Dec 10-14! Excited to reconnect with colleagues and enjoy Vancouver's seafood! 🦐
12 months ago
0
1
0
reposted by
Weijie Su
Quanta Magazine
about 1 year ago
Machine learning has led to predictive algorithms so obscure that they resist analysis. Where does the field of traditional statistics fit into all of this? Emmanuel Candès asks the question, “Can I trust this?” Tune in to this week’s episode of “The Joy of Why”
listen.quantamagazine.org/jow-321-s
How Is AI Changing the Science of Prediction?
Podcast Episode · The Joy of Why · 11/07/2024 · 37m
https://listen.quantamagazine.org/jow-321-s
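For a concrete instance of "Can I trust this?": conformal prediction, a theme of Candès's recent work, wraps any black-box predictor in prediction intervals with distribution-free coverage guarantees (whether the episode covers exactly this is my guess). A minimal split-conformal sketch on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data: y is linear in x plus noise.
x = rng.uniform(0, 10, size=500)
y = 2 * x + rng.normal(0, 1, size=500)

def predict(x_in):
    # Deliberately crude black-box model: always predict the training mean.
    return np.full_like(x_in, y[:250].mean())

# Split conformal: calibrate absolute residuals on held-out data.
cal_residuals = np.abs(y[250:] - predict(x[250:]))
alpha = 0.1  # target 90% coverage
n = len(cal_residuals)
q = np.quantile(cal_residuals, np.ceil((n + 1) * (1 - alpha)) / n)

# Interval for a new point: prediction +/- q, valid under exchangeability
# regardless of how good the model is.
pred = predict(np.array([5.0]))[0]
print(f"90% prediction interval: [{pred - q:.1f}, {pred + q:.1f}]")
```

The coverage guarantee holds no matter how bad the model is; a bad model just pays with wider intervals.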
Knew nothing about Bluesky until today. Immediately stop using X, or gradually migrate to Bluesky? Is there an optimal switching strategy?
about 1 year ago