Leonardo Cotta
@cottascience.bsky.social
📤 1026
📥 250
📝 57
floptimistic @ EIT from BH🔺🇧🇷
http://cottascience.github.io
it's never been more fun to code, because it's never been more valuable to care about your code, to optimize the details and to write beautiful and compact code. this is the valuable type of code now. it's not just the age of research, it's the age of computer science.
20 days ago
0
1
0
I can only imagine how crazy it must be to be a PhD student submitting to ML conferences now. The process has always been noisy, but at this point it's selecting for either obfuscation or shallow ideas. You either intimidate the reviewer, or you write a blog post in latex.
2 months ago
0
2
0
reposted by
Leonardo Cotta
The Matter Lab
5 months ago
We're excited to present our latest article in Nature Machine Intelligence - Boosting the predictive power of protein representations with a corpus of text annotations. Link:
www.nature.com/articles/s42...
[1/4]
1
12
5
the goat of brazilian music w/ the best of (current) american music
www.youtube.com/watch?v=jFUh...
Milton Nascimento & esperanza spalding: Tiny Desk (Home) Concert
YouTube video by NPR Music
https://www.youtube.com/watch?v=jFUhTmOSdGQ&t=31s
5 months ago
0
2
0
I loved this new preprint by Lourie/Hu/
@kyunghyuncho.bsky.social
. If you really wanna convince someone you're training a foundation model, or proposing better methodology, loss scaling laws aren't enough. It has to be tied w/ downstream performance. It shouldn't be vibes.
arxiv.org/abs/2507.00885
Scaling Laws Are Unreliable for Downstream Tasks: A Reality Check
Downstream scaling laws aim to predict task performance at larger scales from pretraining losses at smaller scales. Whether this prediction should be possible is unclear: some works demonstrate that t...
https://arxiv.org/abs/2507.00885
6 months ago
0
5
1
I'm very excited about our new work: SciGym. How can we scale scientific agents' evaluation? TL;DR: Systems biologists have spent decades encoding biochemical networks (metabolic pathways, gene regulation, etc.) into machine-runnable systems. We can use these as "dry labs" to test AI agents!
6 months ago
1
2
0
I wish we had an ML equivalent of SOSA (Symposium On Simplicity in Algorithms). "simpler algorithms manifest a better understanding of the problem at hand; they are more likely to be implemented and trusted by practitioners; they are more easily taught"
www.siam.org/conferences-...
.
7 months ago
1
3
0
reposted by
Leonardo Cotta
Quaid Morris
8 months ago
Please check out our new approach to modeling somatic mutation signatures. DAMUTA has independent Damage and Misrepair signatures whose activities are more interpretable and more predictive of DNA repair defects, than COSMIC SBS signatures 🧬🖥️🧪
www.biorxiv.org/content/10.1...
Damage and Misrepair Signatures: Compact Representations of Pan-cancer Mutational Processes
Mutational signatures of single-base substitutions (SBSs) characterize somatic mutation processes which contribute to cancer development and progression. However, current mutational signatures do not ...
https://www.biorxiv.org/content/10.1101/2025.05.29.656360v1
0
41
17
I haven't been up to date with the model collapse literature, but it's crazy how many papers consider the case where people only reuse data from the model distribution. This never happens: there's always some human curation or conditioning that yields some type of "real-world, new, data".
9 months ago
0
2
0
This is my favourite "graph paper" of the last 1 or 2 years. We also need to start including non-NN baselines, e.g. fingerprints+CatBoost, if the goal is real-world impact and not getting it published asap. I also recommend following
@wpwalters.bsky.social
's blog.
arxiv.org/abs/2502.14546
Position: Graph Learning Will Lose Relevance Due To Poor Benchmarks
While machine learning on graphs has demonstrated promise in drug design and molecular property prediction, significant benchmarking challenges hinder its further progress and relevance. Current bench...
https://arxiv.org/abs/2502.14546
10 months ago
1
7
1
reposted by
Leonardo Cotta
Derek Thompson
11 months ago
Unbelievable news. Pancreatic is one of the deadliest cancers. New paper shows personalized mRNA vaccines can induce durable T cells that attack pancreatic cancer, with 75% of patients cancer free at three years—far, far better than standard of care.
www.nature.com/articles/s41...
140
7271
2243
reposted by
Leonardo Cotta
Thomas Wolf
11 months ago
After 6+ months in the making and over a year of GPU compute, we're excited to release the "Ultra-Scale Playbook":
hf.co/spaces/nanot...
A book to learn all about 5D parallelism, ZeRO, CUDA kernels, how/why overlap compute & coms with theory, motivation, interactive plots and 4000+ experiments!
The Ultra-Scale Playbook - a Hugging Face Space by nanotron
The ultimate guide to training LLM on large GPU Clusters
http://hf.co/spaces/nanotron/ultrascale-playbook
2
180
57
I've always hated the "reasoning models" for code assistance since I think the most useful application of LLMs is really writing the boring helper functions and letting us focus on the hard work. However, I found o3 to be particularly useful when debugging ML code, e.g., 1/2
11 months ago
1
1
0
The whole DeepSeek-R1 thing just highlights computer science's main feature: you can do A LOT with a small team and some (limited) resources. This is how we've been able to scale innovation and why free software is important.
12 months ago
0
6
0
This is an amazing resource (of resources) for machine learners
12 months ago
0
2
0
reposted by
Leonardo Cotta
Sara Magliacane
12 months ago
Sad after
#AISTATS2025
and
#ICLR2025
notifications? As we say in Italy, when a door closes, a bigger one opens ;) If you have a fantastic paper on
#uncertainty
#AI
#ML
#causality
#statML
#probabilisticmodels
#reasoning
#impreciseprobabilities
etc, consider submitting to
#UAI2025
🇧🇷 deadline 10 Feb 💥
2
42
16
reposted by
Leonardo Cotta
Bruno Ribeiro (at #NeurIPS2024)
about 1 year ago
Slides of my presentation "Mathematical Foundations of Graph Foundation Models" yesterday at the AMS Session of the
#JMM2025
. The accompanying paper is coming soon.
www.cs.purdue.edu/homes/ribeir...
0
5
2
Learning Rust ~properly~ during my break and wow -- absolutely worth it! While we're all chasing GPU optimization, there's something magical about crafting efficient CPU-based apps. Clean and fast data processing can change our lives ;)
about 1 year ago
1
2
0
reposted by
Leonardo Cotta
Eugene Vinitsky 🍒
about 1 year ago
My model is that these things are extremely helpful above some skill bar and extremely harmful below some skill bar
8
97
5
very cool observations about using SMILES/graphs vs fingerprints. TL;DR: fingerprints only capture certain properties marginally, and their combinations can often give rise to something new/different.
www.deepmedchem.com/articles/wha...
What Can Neural Network Embeddings Do That Fingerprints Can’t?
https://www.deepmedchem.com/articles/what-can-neural-network-embeddings-do
about 1 year ago
0
5
1
reposted by
Leonardo Cotta
Nikita Dhawan
about 1 year ago
Presenting our poster at NeurIPS! Please come chat about estimating causal effects from user/patient-reported experiences: Thursday, 11AM, West Ballroom A-D #5110.
0
2
1
If you're interested in {causality, language, healthcare}, stop by! Thursday 11am - 2pm West Ballroom A-D #5110
about 1 year ago
0
3
1
reposted by
Leonardo Cotta
Bruno Ribeiro (at #NeurIPS2024)
about 1 year ago
I'm told this is a more intellectual version of ML Twitter :). I have a question... What papers have made good *theoretical* advances towards graph foundation models? Jan 8th 1-2pm I am giving a talk at the Joint Mathematics Meeting on the topic
meetings.ams.org/math/jmm2025...
Mathematical Foundations of Knowledge Graph Foundation Models
One potential definition of a knowledge graph foundation model is one where a g...
https://meetings.ams.org/math/jmm2025/meetingapp.cgi/Paper/41397
0
2
1
reposted by
Leonardo Cotta
Rahul G. Krishnan
about 1 year ago
~Billions of dollars each year are spent on trials to assess interventions. Can we use crowdsourced data to know which intervention is likely to work ahead of time? Doing so requires answering a causal question! But the data to answer this question is locked in unstructured text. 🧵(5/7)
1
0
1
reposted by
Leonardo Cotta
Polaris
about 1 year ago
What are the most interesting datasets and benchmark-related work for ML in drug discovery at NeurIPS? We’ll be at the conference doing short interviews with researchers and handing out some Polaris merch! Here’s who we have on the shortlist. 🧵
2
14
6
reposted by
Leonardo Cotta
Andrei Manolache
about 1 year ago
1/6 We're excited to share our
#NeurIPS2024
paper: Probabilistic Graph Rewiring via Virtual Nodes! It addresses key challenges in GNNs, such as over-squashing and under-reaching, while reducing reliance on heuristic rewiring. w/ Chendi Qian,
@christophermorris.bsky.social
@mniepert.bsky.social
🧵
1
30
7
Dropping into
#NeurIPS2024
in Raincouver 🌧️ next week (Dec 9-15)! Hit me up if you wanna catch up, talk about {AI, science, causality} or whatever fun thing you're building ;)
about 1 year ago
0
3
0
Unless we hold reviewers and ACs accountable, especially ACs in this case, the acceptance of a paper will be determined by whether it got active or inactive reviewers/ACs. This is even worse and more frustrating than the usual reviewer quality lottery.
about 1 year ago
3
3
2
I can’t believe we can just open an app and see arxiv links, cat pictures, and people being civilized. The future has arrived 🌈🦄
about 1 year ago
0
2
0
reposted by
Leonardo Cotta
David Nemer
over 1 year ago
Great piece by
@yasmincurzi.com
for
@theconversation.bsky.social
about the clash between Brazil's Supreme Court and Elon Musk, and the possible implications for platform regulation in the country.
theconversation.com/elon-musks-f...
Elon Musk’s feud with Brazilian judge is much more than a personal spat − it’s about national sovereignty, freedom of speech and the rule of law
Brazil’s attempt to strike a balance between free speech and regulation of online platforms has become politicized – complicating future legislation.
https://theconversation.com/elon-musks-feud-with-brazilian-judge-is-much-more-than-a-personal-spat-its-about-national-sovereignty-freedom-of-speech-and-the-rule-of-law-238264
4
274
57
reposted by
Leonardo Cotta
Yasmin Curzi
over 1 year ago
I wrote for The Conversation US about the clash between the STF (Brazil's Supreme Court) and Elon Musk, highlighting its possible implications for platform regulation in the country.
theconversation.com/elon-musks-f...
Elon Musk’s feud with Brazilian judge is much more than a personal spat − it’s about national sovereignty, freedom of speech and the rule of law
Brazil’s attempt to strike a balance between free speech and regulation of online platforms has become politicized – complicating future legislation.
https://theconversation.com/elon-musks-feud-with-brazilian-judge-is-much-more-than-a-personal-spat-its-about-national-sovereignty-freedom-of-speech-and-the-rule-of-law-238264
1
25
6
reposted by
Leonardo Cotta
Sherri Rose
over 1 year ago
bsky is growing, hi! Some areas I work in: Causal ML
drsherrirose.org/targeted-learning-book
Generalizability
onlinelibrary.wiley.com/doi/10.1111/...
Ethical ML
www.annualreviews.org/doi/abs/10.1...
Plan payment
www.journals.uchicago.edu/doi/10.1086/...
CKD
proceedings.mlr.press/v248/cusick24a.html
0
29
6