Gonzalo Benegas
@gonzalobenegas.bsky.social
📤 254
📥 806
📝 17
Comp Bio Postdoc @ UC Berkeley
https://gonzalobenegas.github.io/
reposted by
Gonzalo Benegas
Yun S. Song
9 days ago
We are excited to share GPN-Star, a cost-effective, biologically grounded genomic language modeling framework that achieves state-of-the-art performance across a wide range of variant effect prediction tasks relevant to human genetics.
www.biorxiv.org/content/10.1...
(1/n)
4
165
92
reposted by
Gonzalo Benegas
Joana L. Rocha
3 months ago
I am thrilled to announce that in January 2026 I will be starting my own lab at NYU Biology! Soon enough I will be recruiting postdocs and students! Please reach out if you are interested with a CV and description of your research interests, or if you know of people who could be interested! 🧬🗽 🦊
7
82
19
reposted by
Gonzalo Benegas
Yun S. Song
4 months ago
How can one efficiently simulate phylodynamics for populations with billions of individuals, as is typical in many applications, e.g., viral evolution and cancer genomics? In this work with M. Celentano, @wsdewitt.github.io, & S. Prillo, we provide a solution.
doi.org/10.1073/pnas...
1/n
1
37
16
reposted by
Gonzalo Benegas
Yun S. Song
6 months ago
Thrilled to see my digital art on the cover of Trends Genet. The two binary strings represent reverse-complementary DNA sequences (00=A, 01=C, 10=G, 11=T) and the connecting rectangles represent “embeddings” learned by DNA language models. Pls check out our article as well:
doi.org/10.1016/j.ti...
0
69
14
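The 2-bit encoding mentioned in the cover-art post above (00=A, 01=C, 10=G, 11=T) can be sketched in a few lines. This is only an illustration, not code from the article: the function names are hypothetical, and only the base-to-bits mapping comes from the post. One property of this particular mapping is that complementing a base (A<->T, C<->G) is the same as flipping both of its bits, so a reverse complement is a bit flip followed by reversing the order of the 2-bit codes.

```python
# Mapping taken from the post; all function names below are hypothetical.
ENCODE = {"A": "00", "C": "01", "G": "10", "T": "11"}
DECODE = {bits: base for base, bits in ENCODE.items()}


def encode(seq: str) -> str:
    """Turn a DNA string into a binary string, two bits per base."""
    return "".join(ENCODE[base] for base in seq.upper())


def decode(bits: str) -> str:
    """Turn a binary string back into DNA, reading two bits at a time."""
    if len(bits) % 2 != 0:
        raise ValueError("bit string length must be even")
    return "".join(DECODE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def reverse_complement_bits(bits: str) -> str:
    """Reverse-complement directly on the binary representation.

    Flipping every bit complements each base under this mapping
    (00<->11 is A<->T, 01<->10 is C<->G); the 2-bit codes are then
    reversed as whole units so each code stays intact.
    """
    if len(bits) % 2 != 0:
        raise ValueError("bit string length must be even")
    flipped = "".join("1" if b == "0" else "0" for b in bits)
    codes = [flipped[i:i + 2] for i in range(0, len(flipped), 2)]
    return "".join(reversed(codes))


forward = encode("AAC")
reverse = reverse_complement_bits(forward)
print(forward, reverse, decode(reverse))
```

Working on whole 2-bit codes, rather than reversing the raw bit string, matters here: reversing individual bits would turn 01 (C) into 10 (G) and corrupt the sequence.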
reposted by
Gonzalo Benegas
Yun S. Song
7 months ago
In our updated TraitGym preprint (w/ @gonzalobenegas.bsky.social & Gökcen Eraslan), we evaluate Evo 2 on regulatory variants associated with human traits. We see marked performance gains with scale on Mendelian traits, although still a bit behind alignment-based methods.
doi.org/10.1101/2025...
1/n
1
32
15
Can DNA sequence models predict mutations affecting human traits? We introduce TraitGym, a curated benchmark of causal regulatory variants for 113 Mendelian & 83 complex traits, and evaluate functional genomics and DNA language models. Joint work w/ Gökcen Eraslan and @yun-s-song.bsky.social 🧵👇
8 months ago
1
27
16
reposted by
Gonzalo Benegas
bioRxiv Genetics
8 months ago
Benchmarking DNA Sequence Models for Causal Regulatory Variant Prediction in Human Genetics
https://www.biorxiv.org/content/10.1101/2025.02.11.637758v1
0
8
2
reposted by
Gonzalo Benegas
Yun S. Song
9 months ago
Our work, which shows statistical issues with the previous claim of a severe ancient bottleneck in the ancestry of African populations, has been selected as a Featured article in Genetics.
doi.org/10.1093/gene...
A previously reported bottleneck in human ancestry 900 kya is likely a statistical artifact
Hu et al. (Science, 2023) recently inferred a severe ancient bottleneck around 900 thousand years (kya) ago in African ancestry but found no similar eviden
https://doi.org/10.1093/genetics/iyae192
0
15
4
reposted by
Gonzalo Benegas
Yun S. Song
9 months ago
Coincidentally, another article from my lab on DNA language models got published on the same day as GPN-MSA. It's freely available for 50 days from this link:
authors.elsevier.com/a/1kNCscQbJB...
Genomic language models: opportunities and challenges
Please share with your colleagues.
https://authors.elsevier.com/a/1kNCscQbJBLQC
1
10
2
reposted by
Gonzalo Benegas
Yun S. Song
9 months ago
Happy New Year! Our GPN-MSA paper is finally published, under a slightly different title from the preprint. Please check it out and share it with your colleagues:
doi.org/10.1038/s415...
1/4
A DNA language model based on multispecies alignment predicts the effects of genome-wide variants - Nature Biotechnology
A language model predicts the effects of genetic variants in the human genome.
https://doi.org/10.1038/s41587-024-02511-w
1
16
8
reposted by
Gonzalo Benegas
Nature Biotechnology
9 months ago
A DNA language model based on multispecies alignment predicts the effects of genome-wide variants - @yun-s-song.bsky.social
go.nature.com/4gWppWg
A DNA language model based on multispecies alignment predicts the effects of genome-wide variants - Nature Biotechnology
A language model predicts the effects of genetic variants in the human genome.
https://go.nature.com/4gWppWg
0
31
13
reposted by
Gonzalo Benegas
Yun S. Song
11 months ago
Large protein language models can learn complex epistatic interactions, but how much does that help with predicting variant effects? In this NeurIPS article, we show that classical independent-sites phylogenetic models can outperform pLMs on this task. 1/7
openreview.net/forum?id=H7m...
Ultrafast classical phylogenetic method beats large protein...
Amino acid substitution rate matrices are fundamental to statistical phylogenetics and evolutionary biology. Estimating them typically requires reconstructed trees for massive amounts of aligned...
https://openreview.net/forum?id=H7mENkYB2J
2
92
46