Andrew Gordon Wilson
@andrewgwils.bsky.social
Machine Learning Professor
https://cims.nyu.edu/~andrewgw
I'm excited to be giving a keynote talk at the AutoML conference tomorrow, 9-10 am at Cornell Tech! I'm presenting "Prescriptions for Universal Learning". I'll talk about how we can enable automation, which I'll argue is the defining feature of ML.
2025.automl.cc/program/
13 days ago
Research doesn't go in circles, but in spirals. We return to the same ideas, but in a different and augmented form.
21 days ago
Reposted by Andrew Gordon Wilson
NYU Center for Data Science
26 days ago
CDS/Courant Professor Andrew Gordon Wilson (@andrewgwils.bsky.social) argues that mysterious behavior in deep learning can be explained by decades-old theory, not new paradigms: PAC-Bayes bounds, soft inductive biases, and large models with a soft simplicity bias.
nyudatascience.medium.com/deep-learnin...
Deep Learning’s Most Puzzling Phenomena Can Be Explained by Decades-Old Theory
Andrew Gordon Wilson argues that many generalization phenomena in deep learning can be explained using decades-old theoretical tools.
https://nyudatascience.medium.com/deep-learnings-most-puzzling-phenomena-can-be-explained-by-decades-old-theory-91d4cf235a89
Regardless of whether you plan to use them in applications, everyone should learn about Gaussian processes and Bayesian methods. They provide a foundation for reasoning about model construction and all sorts of deep learning behaviour that would otherwise appear mysterious.
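If you want a feel for what that means in practice, here is a minimal GP regression sketch (illustrative only: the RBF kernel, lengthscale, noise level, and toy data are arbitrary choices, not anything prescribed by the post):

```python
import numpy as np

# Minimal Gaussian process regression with an RBF kernel (illustrative).
# The posterior mean and covariance are available in closed form.

def rbf(X1, X2, lengthscale=1.0):
    # Squared-exponential kernel k(x, x') = exp(-|x - x'|^2 / (2 l^2))
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def gp_posterior(Xtr, ytr, Xte, noise=1e-2):
    K = rbf(Xtr, Xtr) + noise * np.eye(len(Xtr))  # train covariance + noise
    Ks, Kss = rbf(Xtr, Xte), rbf(Xte, Xte)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, ytr))
    mean = Ks.T @ alpha                           # posterior mean
    v = np.linalg.solve(L, Ks)
    cov = Kss - v.T @ v                           # posterior covariance
    return mean, cov

Xtr = np.linspace(-3, 3, 10)[:, None]
ytr = np.sin(Xtr).ravel()
Xte = np.linspace(-4, 4, 50)[:, None]
mean, cov = gp_posterior(Xtr, ytr, Xte)  # uncertainty grows away from data
```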
about 1 month ago
A common takeaway from "the bitter lesson" is that we don't need to put effort into encoding inductive biases; we just need compute. Nothing could be further from the truth! Better inductive biases mean better scaling exponents, which means exponential improvements with computation.
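As a toy illustration of the scaling-exponent point, assume (hypothetically) a power law L(C) = a·C^(-α); then the compute needed to reach a target loss is C = (a/L)^(1/α), and a better exponent pays off more the further you push:

```python
# Toy calculation under an assumed power law L(C) = a * C**(-alpha).
# The functional form and numbers are illustrative, not from the post.

def compute_for_target(target_loss, a=1.0, alpha=0.5):
    # Invert L(C) = a * C**(-alpha)  =>  C = (a / L)**(1 / alpha)
    return (a / target_loss) ** (1.0 / alpha)

for target in (1e-1, 1e-2, 1e-3):
    c_base = compute_for_target(target, alpha=0.5)
    c_better = compute_for_target(target, alpha=0.6)  # better inductive bias
    print(f"target {target:g}: compute saving {c_base / c_better:.1f}x")
# The saving grows as the target loss shrinks: 2.2x, 4.6x, 10.0x.
```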
about 1 month ago
Gould mostly recorded baroque and early classical. He recorded only a single Chopin piece, as a one-off broadcast. But like many of his efforts, it's profoundly thought-provoking, the end product as much Gould as it is Chopin. I love the last movement (20:55+).
www.youtube.com/watch?v=NAHE...
Glenn Gould plays Chopin Piano Sonata No. 3 in B minor Op.58
YouTube video by The Piano Experience
https://www.youtube.com/watch?v=NAHE8PTR8tE
about 2 months ago
Whatever you do, just don't be boring.
about 2 months ago
I had a great time presenting "It's Time to Say Goodbye to Hard Constraints" at the Flatiron Institute. In this talk, I describe a philosophy for model construction in machine learning. Video now online!
www.youtube.com/watch?v=LxuN...
It's Time to Say Goodbye to Hard (equivariance) Constraints - Andrew Gordon Wilson
YouTube video by LoG Meetup NYC
https://www.youtube.com/watch?v=LxuNC3I7Fxg
2 months ago
Excited to be presenting my paper "Deep Learning is Not So Mysterious or Different" tomorrow at ICML, 11 am - 1:30 pm, East Exhibition Hall A-B, E-500. I made a little video overview as part of the ICML process (viewable from Chrome):
recorder-v3.slideslive.com#/share?share...
2 months ago
Our new ICML paper discovers scaling collapse: through a simple affine transformation, whole training loss curves across model sizes with optimally scaled hyperparameters collapse to a single universal curve! We explain the collapse, providing a diagnostic for model scaling.
arxiv.org/abs/2507.02119
1/3
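A rough sketch of the flavor of transformation involved (hedged: the paper's actual transformation, and the conditions under which collapse holds, are in the arXiv link above):

```python
import numpy as np

# Illustrative sketch: an affine map in loss plus a rescaling in time.
# Curves that differ only by a scale and an offset collapse exactly.

def collapse(steps, losses):
    t = steps / steps[-1]                                  # normalized time
    y = (losses - losses[-1]) / (losses[0] - losses[-1])   # affine map to [0, 1]
    return t, y

steps = np.arange(1, 1001)
curves = [s * np.exp(-steps / 300.0) + o for s, o in [(1.0, 0.1), (0.5, 0.05)]]
collapsed = [collapse(steps, c) for c in curves]
assert np.allclose(collapsed[0][1], collapsed[1][1])       # one universal curve
```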
3 months ago
Excited about our new ICML paper, showing how algebraic structure can be exploited for massive computational gains in population genetics.
3 months ago
Machine learning is perhaps the only discipline that has become less mature over time. A reverse metamorphosis, from butterfly to caterpillar.
3 months ago
AI this, AI that, the implications of AI for X... can we just never talk about AI again?
3 months ago
Really excited about our new paper, "Why Masking Diffusion Works: Condition on the Jump Schedule for Improved Discrete Diffusion". We explain the mysterious success of masking diffusion to propose new diffusion models that work well in a variety of settings, including proteins, images, and text!
3 months ago
A really outstanding interview with Terence Tao, providing an introduction to many topics, including the math of general relativity (youtube.com/watch?v=HUkB...). I love relativity, and in a recent(ish) paper we also consider the wave maps equation (section 5, arxiv.org/abs/2304.14994).
Terence Tao: Hardest Problems in Mathematics, Physics & the Future of AI | Lex Fridman Podcast #472
YouTube video by Lex Fridman
https://youtube.com/watch?v=HUkBz-cdB-k
3 months ago
AI benchmarking culture is completely out of control. Tables with dozens of methods, datasets, and bold numbers, trying to answer a question that perhaps no one should be asking anymore.
4 months ago
We have a strong bias to overestimate the speed of technological innovation and impact. See past claims about autonomous driving, AI curing diseases... or the timeline in every sci-fi book ever written. Where is my flying car?
5 months ago
My new paper "Deep Learning is Not So Mysterious or Different":
arxiv.org/abs/2503.02113
Generalization behaviours in deep learning can be intuitively understood through a notion of soft inductive biases, and formally characterized with countable hypothesis bounds! 1/12
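For context, a standard countable hypothesis bound of the classical Occam flavor (the paper's precise statements may differ): for a countable hypothesis class H with prior π over H, with probability at least 1 − δ over n i.i.d. samples, for every h in H,

```latex
R(h) \;\le\; \widehat{R}(h) \;+\; \sqrt{\frac{\ln\frac{1}{\pi(h)} + \ln\frac{1}{\delta}}{2n}}
```

So hypotheses assigned high prior mass come with strong guarantees even when the class as a whole is enormous, which is how a soft simplicity bias coexists with very large models.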
7 months ago
I had a great time talking with @anilananth.bsky.social as part of the Simons Institute Polylogues. We cover universal learning, generalization phenomena, how transformers are both surprisingly general but also limited, and the difference between statistics and ML!
www.youtube.com/watch?v=Aja0...
Andrew Gordon Wilson | Polylogues
YouTube video by Simons Institute
https://www.youtube.com/watch?v=Aja0kZeWRy4
7 months ago
These DeepSeek results mostly just reflect the diminishing gap between open and closed models, such that any company with billions can start with llama as a baseline, make some tweaks, and appear like the next OpenAI. Going forward, data and scale won't be the decisive advantage.
8 months ago
It's not the size of your parameter space that matters, it's how you use it.
8 months ago
With interview season coming, don't despair. I conspicuously forgot the name of the place where I was interviewing during a 1-1. I made sure to name-drop the university a bunch in my job talk right after, just so my allies could be like "he really does know the name".
8 months ago
There's apparently another Andrew Wilson at NYU who teaches piano lessons. I get a lot of emails meant for him. Maybe I'll charge his rate minus $1.
8 months ago
Reposted by Andrew Gordon Wilson
Brandon Amos
9 months ago
📢 My team at Meta (including Yaron Lipman and Ricky Chen) is hiring a postdoctoral researcher to help us build the next generation of flow, transport, and diffusion models! Please apply here and message me:
www.metacareers.com/jobs/1459691...
Postdoctoral Researcher, Fundamental AI Research (PhD)
Meta's mission is to build the future of human connection and the technology that makes it possible.
https://www.metacareers.com/jobs/1459691901359421/
We're excited to announce the ICML 2025 call for workshops! The CFP and submission advice can be found at:
icml.cc/Conferences/...
The deadline is Feb 10. Submit some creative proposals!
ICML 2025 Call for Workshops
https://icml.cc/Conferences/2025/CallForWorkshops
9 months ago
Happy New Year everyone! Excited for the year ahead.
9 months ago
Many of the greatest papers, now canonical works, have a story of resistance, tension, and, finally, a crucial advocate. It's shockingly common. Why is there a bias against excellence? And what happens to those papers, those people, when no one has the courage to advocate?
9 months ago
Research scientists using industry GPUs these days... "But Mr Garnier… we're scientists, we want to change the world. You have the finest GPUs that money can buy! You employ 3000 research staff."
www.youtube.com/watch?v=hdHF...
That Mitchell and Webb Look - The Garnier Laboratoire
YouTube video by fanvideos4u
https://www.youtube.com/watch?v=hdHFmc9oiKY
9 months ago
This is your monthly reminder that understanding deep learning does not require rethinking generalization, and it never did.
9 months ago
So excited about this new work on Bayesian optimization for antibody design! It works by teaching a generative model how the human immune system evolves antibodies toward strong and stable binders. A satisfying mix of ML+Bio. Check out the great thread from @alannawzadamin.bsky.social and the paper!
9 months ago
Excited for the #NeurIPS2024 workshops today! I'll be speaking at: (1) Science of DL (panel, 3:10-4:10, scienceofdlworkshop.github.io/schedule/); (2) "Time Series in the Age of Large Models" (talk, 4:39-5:14, neurips-time-series-workshop.github.io).
Schedule | SciForDL'24
https://scienceofdlworkshop.github.io/schedule/
9 months ago
Reposted by Andrew Gordon Wilson
Alan Amin
9 months ago
New model trained on a new dataset of nearly a million evolving antibody families, presented at the AIDrugX workshop Sunday at 4:20 pm (#76) #Neurips! A collab between @andrewgwils.bsky.social and BigHatBio. Stay tuned in the coming days for a full thread on how we used the model to optimize antibodies in the lab!
It feels like _so_ much time has passed since NeurIPS in New Orleans last year. We're in a different universe.
9 months ago
Nice crowd and lots of engagement at our NeurIPS poster today, with Sanae Lotfi presenting on token-level generalization bounds for LLMs!
arxiv.org/abs/2407.18158
9 months ago
Every year at NeurIPS, I get a sense of where the community is headed. I'm so happy that the era of larger language models on larger datasets is coming to an end.
9 months ago
Is wearing a scarf indoors a power move?
10 months ago
The logo needs more affirmation!
10 months ago
I wanted to make my first post about a project close to my heart. Linear algebra is an underappreciated foundation for machine learning. Our new framework CoLA (Compositional Linear Algebra) exploits algebraic structure arising from modelling assumptions for significant computational savings! 1/4
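To give one concrete flavor of the kind of structure exploitation involved (a generic example, not the CoLA API): a Kronecker-structured matrix-vector product can be computed without ever forming the dense matrix.

```python
import numpy as np

# Generic example of exploiting algebraic structure (not the CoLA API):
# for K = kron(A, B) with A (m x m) and B (n x n), the product K @ v
# costs O(mn(m + n)) via (A ⊗ B) vec(X) = vec(A X B^T), instead of
# O(m^2 n^2) time and memory with the dense (mn x mn) matrix.

def kron_matvec(A, B, v):
    m, n = A.shape[0], B.shape[0]
    X = v.reshape(m, n)              # row-major unvec of v
    return (A @ X @ B.T).reshape(-1)

rng = np.random.default_rng(0)
A = rng.standard_normal((40, 40))
B = rng.standard_normal((60, 60))
v = rng.standard_normal(40 * 60)

# Agrees with the dense computation, without the 2400 x 2400 matrix:
assert np.allclose(kron_matvec(A, B, v), np.kron(A, B) @ v)
```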
10 months ago
Reposted by Andrew Gordon Wilson
Anil Ananthaswamy
10 months ago
Will simply scaling up LLMs get us to AGI? My feature for Nature, w/ inputs from @fchollet.bsky.social, @rao2z.bsky.social, @yoshuabengio.bsky.social, @melaniemitchell.bsky.social, @dileeplearning.bsky.social, @andrewgwils.bsky.social, Raia Hadsell, Keyon Vafa, and Karl Friston:
www.nature.com/articles/d41...
How close is AI to human-level intelligence?
Large language models such as OpenAI’s o1 have electrified the debate over achieving artificial general intelligence, or AGI. But they are unlikely to reach this milestone on their own.
https://www.nature.com/articles/d41586-024-03905-1