Yun S. Song
@yun-s-song.bsky.social
📤 776
📥 119
📝 62
Professor of EECS and Statistics at UC Berkeley. Mathematical and computational biologist.
Published online on Jan 2, 2025 and just appeared in the December 2025 issue!
10 days ago
0
18
4
reposted by
Yun S. Song
Peter Sudmant
11 days ago
The registration deadline is fast approaching for ProbGen 2026! Abstracts are due by January 15 and registration by January 31.
probgen2026.github.io
Home - ProbGen 2026
https://probgen2026.github.io/
1
12
14
reposted by
Yun S. Song
Frederick "Erick" Matsen
18 days ago
Over the past 5+ years I've had the honor of working with
@wsdewitt.github.io
@victora.bsky.social
and many others on a project to "replay" affinity maturation evolution from a fixed starting point.
matsen.group/general/2025...
Replaying evolution to learn about the fitness landscape of affinity maturation
A five year collaboration with the Victora lab is bearing fruit for evolutionary biology.
https://matsen.group/general/2025/11/28/replay.html
2
26
17
reposted by
Yun S. Song
Society for Molecular Biology and Evolution
20 days ago
Organisers - Shu Zhang |
@gladstoneinst.bsky.social
Invited Speaker -
@yun-s-song.bsky.social
|
@ucberkeleyofficial.bsky.social
0
3
1
reposted by
Yun S. Song
Mia Levine
about 1 month ago
How to keep in step when your (protein) partner speeds up… Here we investigated the adaptive remodeling of a protein-protein interaction surface essential for telomere protection. Congrats to the whole team!
www.science.org/doi/10.1126/...
Rapid compensatory evolution within a multiprotein complex preserves telomere integrity
Intragenomic conflict with selfish genetic elements spurs adaptive changes in subunits of essential multiprotein complexes. Whether and how these adaptive changes disrupt interactions within such comp...
https://www.science.org/doi/10.1126/science.adv0657
6
119
68
reposted by
Yun S. Song
Yun Deng
about 1 month ago
The last work of my PhD is finally out:
www.pnas.org/doi/10.1073/...
! This work is about accurately estimating branch length in the Ancestral Recombination Graph (ARG), which is achieved by a really simple framework with minimal assumptions. (1/n)
PNAS
Proceedings of the National Academy of Sciences (PNAS), a peer reviewed journal of the National Academy of Sciences (NAS) - an authoritative source of high-impact, original research that broadly spans...
https://www.pnas.org/doi/10.1073/pnas.2504461122
1
48
18
An open-rank faculty search in AI + Engineering (Bioengineering included) at UC Berkeley. Due date: Monday, Nov 3, 2025 at 11:59pm (PT). Please help spread the news.
aprecruit.berkeley.edu/JPF05144
Assistant/Associate/Full Professor – Engineering + Artificial Intelligence - College of Engineering (host academic department(s) to be determined)
University of California, Berkeley is hiring. Apply now!
https://aprecruit.berkeley.edu/JPF05144
3 months ago
1
6
3
reposted by
Yun S. Song
Anshul Kundaje
3 months ago
This is truly an incredible breakthrough IMO. Really exemplifies what you get when deep domain expertise (popgen/evolution/disease genetics in this case) fuses with cleverly crafted ML. What u get r sleek, well thought out architectures that absolutely destroy the behemoths. Wow!! 1/
1
60
15
We are excited to share GPN-Star, a cost-effective, biologically grounded genomic language modeling framework that achieves state-of-the-art performance across a wide range of variant effect prediction tasks relevant to human genetics.
www.biorxiv.org/content/10.1...
(1/n)
3 months ago
4
174
95
SINGER, our ARG inference method, is finally published and freely available online:
doi.org/10.1038/s415...
It was a long journey – 16 months from initial submission to acceptance. Is it just me, or has peer review gotten more arduous lately? 4+ rounds of review isn't so unusual these days...
Robust and accurate Bayesian inference of genome-wide genealogies for hundreds of genomes - Nature Genetics
SINGER is a method for creating ancestral recombination graphs to understand the genealogical history of genomes. The method has increased speed, and thus scalability, without sacrificing accuracy.
https://doi.org/10.1038/s41588-025-02317-9
4 months ago
1
101
55
reposted by
Yun S. Song
Alan Aw
4 months ago
Hi Bluesky — Dedicating my first post to this work and software, led by the incredibly meticulous and capable
@fandingzhou.bsky.social
! An earlier version of this was shared at the 2022 Bioconductor Conference (
bioc2022.bioconductor.org/schedule/
).
1
4
1
reposted by
Yun S. Song
4 months ago
Gene expression changes aren’t just about mean shifts — variability shifts matter too, especially for aging. We're thrilled to introduce QRscore, a flexible non-parametric framework for detecting shifts in mean and variance across conditions.
doi.org/10.1016/j.cr...
1
12
4
reposted by
Yun S. Song
4 months ago
In a new preprint we use deep learning on lineage trees to infer the functional form of the relationship between affinity and fitness that controls antibody evolution in germinal centers:
arxiv.org/abs/2508.09871
🧵
Inference of germinal center evolutionary dynamics via simulation-based deep learning
B cells and the antibodies they produce are vital to health and survival, motivating research on the details of the mutational and evolutionary processes in the germinal centers (GC) from which mature...
https://arxiv.org/abs/2508.09871
1
15
9
Antibodies are highly diverse, but most possible sequences are unstable or polyreactive. In this work, just published in Cell Syst., we propose a new source of data for modeling constraints from these properties. Our models show clear improvements in predicting Ab dysfunction. (1/n)
t.co/qCZERPUMPF
https://authors.elsevier.com/a/1lbX08YyDfuZWX
https://t.co/qCZERPUMPF
5 months ago
1
16
6
reposted by
Yun S. Song
Earth BioGenome Project 🌍
5 months ago
(1/4) 🧬 Why Sequence the Genomes of Earth’s Biodiversity? The Earth BioGenome Project 🌍 is a global network of initiatives working together to create a complete genome library for all Eukaryotic life—from mushrooms 🍄 to mammals 🐘.
#biodiversity
#genomes
#sequence
#earthbiogenome
#education
#stem
1
17
12
reposted by
Yun S. Song
Center for Cancer Immunotherapy and Immunobiology
6 months ago
Germinal center clonal diversity trees as a musical score, a great image to start
@victora.bsky.social
's CCII seminar, "Replaying germinal center evolution on a quantified affinity landscape"
#GerminalCenter
#Immunology
www.ccii.med.kyoto-u.ac.jp/en/event/the...
1
18
8
reposted by
Yun S. Song
Jacob Schreiber
6 months ago
In vivo mapping of mutagenesis sensitivity of human enhancers
www.nature.com/articles/s41...
In vivo mapping of mutagenesis sensitivity of human enhancers - Nature
Human enhancers contain a high density of sequence features that are required for their normal in vivo function.
https://www.nature.com/articles/s41586-025-09182-w
0
48
20
The 2026 Probabilistic Modeling in Genomics (ProbGen) meeting will be held at UC Berkeley, March 25-28, 2026. We have an amazing list of keynote speakers and session chairs:
probgen2026.github.io
Please help spread the news.
Home - ProbGen 2026
https://probgen2026.github.io
7 months ago
2
69
36
reposted by
Yun S. Song
Gabriel Victora
7 months ago
Wanted to highlight our latest preprint--a huge effort by multiple people and labs, but led primarily by
@wsdewitt.github.io
, Tatsuya Araki, and Ashni Vora, in a very close wet-dry collaboration with
@matsen.bsky.social
’s lab at the Hutch
www.biorxiv.org/content/10.1...
Replaying germinal center evolution on a quantified affinity landscape
Darwinian evolution of immunoglobulin genes within germinal centers (GC) underlies the progressive increase in antibody affinity following antigen exposure. Whereas the mechanics of how competition be...
https://www.biorxiv.org/content/10.1101/2025.06.02.656870v1
1
66
33
reposted by
Yun S. Song
Innovative Genomics Institute
7 months ago
Check out CRISPRpedia, our resource on all things
#CRISPR
! The latest chapter is on CRISPR & ethics:
innovativegenomics.org/crisprpedia/...
CRISPRpedia features 85+ original illustrations that are free to download & use for non-commercial purposes!
#STEMeducation
#STEMed
#bioethics
#SciArt
0
8
3
reposted by
Yun S. Song
Jeff Spence
7 months ago
How well can deep learning models predict the effect of modifying chromatin on gene expression??? Our work -- led by Sanjit Batra and Alan Cabrera when they were in
@yun-s-song.bsky.social
’s and Isaac Hilton’s labs -- tries to answer this. 🧵🧬🧪
elifesciences.org/reviewed-pre...
Predicting the effect of CRISPR-Cas9-based epigenome editing
https://elifesciences.org/reviewed-preprints/92991
1
14
3
reposted by
Yun S. Song
Charlie Pugh
7 months ago
New preprint in collaboration with
@paulinanunezv.bsky.social
supervised by
@jonnyfrazer.bsky.social
and Mafalda Dias – we propose a simple approach to improving zero-shot variant effect prediction in pre-existing protein and genome language models: 🧶 1/n
www.biorxiv.org/content/10.1...
From Likelihood to Fitness: Improving Variant Effect Prediction in Protein and Genome Language Models
Generative models trained on natural sequences are increasingly used to predict the effects of genetic variation, enabling progress in therapeutic design, disease risk prediction, and synthetic biolog...
https://www.biorxiv.org/content/10.1101/2025.05.20.655154v1
1
75
27
How can one efficiently simulate phylodynamics for populations with billions of individuals, as is typical in many applications, e.g., viral evolution and cancer genomics? In this work with M. Celentano,
@wsdewitt.github.io
, & S. Prillo, we provide a solution.
doi.org/10.1073/pnas...
1/n
7 months ago
1
37
16
reposted by
Yun S. Song
Innovative Genomics Institute
8 months ago
In a medical breakthrough, a team including IGI’s
@urnov.bsky.social
&
@giannikopoulosp.bsky.social
created an on-demand
#CRISPR
therapy for an infant with a deadly gene mutation — developed, approved, and delivered to the patient in just 6 months. Read more:
ow.ly/G0Bg50VTonC
#RareDisease
🧬
0
49
25
reposted by
Yun S. Song
Innovative Genomics Institute
7 months ago
Jennifer Doudna
@jenniferdoudna.bsky.social
@doudna-lab.bsky.social
speaks with Cleo Abram on the history and future of
#CRISPR
🧬. Watch here:
youtu.be/0OXaanDHENI?..
.
You Can Fix Your DNA... Starting Now (feat. Nobel Prize Winner)
YouTube video by Cleo Abram
https://youtu.be/0OXaanDHENI?..
0
11
6
reposted by
Yun S. Song
Andrea Montanari
8 months ago
Overfitting is among the most conceptually interesting problems in machine learning. I am happy about several new phenomena we have begun to understand with Pierfrancesco Urbani. Alert: mostly non-rigorous! (Celebrating Jorge Kurchan)
web.stanford.edu/~montanar/OT...
https://web.stanford.edu/~montanar/OTHER/TALKS/paris2025.pdf
1
27
6
reposted by
Yun S. Song
Heng Li
8 months ago
If you want to check if a human gene has copy-number changes or lands in a complex region, try
pangene.bioinweb.org
. Recently updated with more and better assemblies.
2
44
14
Thrilled to see my digital art on the cover of Trends Genet. The two binary strings represent reverse-complementary DNA sequences (00=A, 01=C, 10=G, 11=T) and the connecting rectangles represent “embeddings” learned by DNA language models. Pls check out our article as well:
doi.org/10.1016/j.ti...
9 months ago
0
69
14
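The encoding described in the cover-art post above (00=A, 01=C, 10=G, 11=T) has a convenient property: complementary bases have bitwise-inverted codes, so a reverse complement can be taken by flipping every bit and reversing the order of the 2-bit pairs. Below is a minimal Python sketch of that idea; the function names are hypothetical and chosen only for illustration, and this is not taken from the article or the artwork itself.

```python
# 2-bit encoding as stated in the post: 00=A, 01=C, 10=G, 11=T.
# All names below are illustrative assumptions, not from the source.
ENCODE = {"A": "00", "C": "01", "G": "10", "T": "11"}
DECODE = {bits: base for base, bits in ENCODE.items()}


def encode(seq: str) -> str:
    """Turn a DNA string into its binary-string representation."""
    return "".join(ENCODE[base] for base in seq.upper())


def decode(bits: str) -> str:
    """Turn a binary string (even length) back into a DNA string."""
    return "".join(DECODE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def reverse_complement_bits(bits: str) -> str:
    """Reverse complement computed directly on the binary string.

    Flipping each bit maps A<->T (00<->11) and C<->G (01<->10);
    reversing the 2-bit pairs then reverses the sequence.
    """
    flipped = "".join("1" if b == "0" else "0" for b in bits)
    pairs = [flipped[i:i + 2] for i in range(0, len(flipped), 2)]
    return "".join(reversed(pairs))


if __name__ == "__main__":
    s = "AAC"
    print(s, encode(s), decode(reverse_complement_bits(encode(s))))
```

Under this encoding the complement needs no lookup table, which is presumably part of why the two strings on the cover mirror each other so cleanly.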
In our updated TraitGym preprint (w/
@gonzalobenegas.bsky.social
& Gökcen Eraslan), we evaluate Evo 2 on regulatory variants associated with human traits. We see marked performance gains with scale on Mendelian traits, although still a bit behind alignment-based methods.
doi.org/10.1101/2025...
1/n
10 months ago
1
32
15
reposted by
Yun S. Song
Gonzalo Benegas
11 months ago
Can DNA sequence models predict mutations affecting human traits? We introduce TraitGym, a curated benchmark of causal regulatory variants for 113 Mendelian & 83 complex traits, and evaluate functional genomics and DNA language models. Joint work w/ Gökcen Eraslan and
@yun-s-song.bsky.social
🧵👇
1
28
17
reposted by
Yun S. Song
Hani Goodarzi
12 months ago
A month ago we
@vevotherapeutics.bsky.social
announced that we have generated the largest single-cell perturbation atlas in history, Tahoe-100M. Today, we announce that we will fully open-source Tahoe-100M in Feb, as part of a collaboration with NVidia health to train cell state models.
4
116
38
Our work, which shows statistical issues with the previous claim of a severe ancient bottleneck in the ancestry of African populations, has been selected as a Featured article in Genetics.
doi.org/10.1093/gene...
A previously reported bottleneck in human ancestry 900 kya is likely a statistical artifact
Hu et al. (Science, 2023) recently inferred a severe ancient bottleneck around 900 thousand years (kya) ago in African ancestry but found no similar eviden
https://doi.org/10.1093/genetics/iyae192
12 months ago
0
15
4
Coincidentally, another article from my lab on DNA language models got published on the same day as GPN-MSA. It's freely available for 50 days from this link:
authors.elsevier.com/a/1kNCscQbJB...
Genomic language models: opportunities and challenges
Please share with your colleagues.
https://authors.elsevier.com/a/1kNCscQbJBLQC
12 months ago
1
10
2
Happy New Year! Our GPN-MSA paper is finally published, under a slightly different title from the preprint. Please check it out and share it with your colleagues:
doi.org/10.1038/s415...
1/4
A DNA language model based on multispecies alignment predicts the effects of genome-wide variants - Nature Biotechnology
A language model predicts the effects of genetic variants in the human genome.
https://doi.org/10.1038/s41587-024-02511-w
12 months ago
1
16
8
reposted by
Yun S. Song
Jeff Spence
about 1 year ago
What do GWAS and rare variant burden tests discover, and why? Do these studies find the most IMPORTANT genes? If not, how DO they rank genes? Here we present a surprising result: these studies actually test for SPECIFICITY! A 🧵on what this means... (🧪🧬)
www.biorxiv.org/content/10.1...
Specificity, length, and luck: How genes are prioritized by rare and common variant association studies
Standard genome-wide association studies (GWAS) and rare variant burden tests are essential tools for identifying trait-relevant genes. Although these methods are conceptually similar, we show by anal...
https://www.biorxiv.org/content/10.1101/2024.12.12.628073v1
4
208
103
reposted by
Yun S. Song
Ian Holmes
about 1 year ago
Just had a thesis meeting with Yun Deng, finishing his PhD in
@yun-s-song.bsky.social
’s lab and beginning a postdoc with
@jkpritch.bsky.social
. His SINGER software for MCMC inference of ancestral recombination graphs is a really impressive contribution to the field
www.biorxiv.org/content/10.1...
Robust and Accurate Bayesian Inference of Genome-Wide Genealogies for Large Samples
The Ancestral Recombination Graph (ARG), which describes the full genealogical history of a sample of genomes, is a vital tool in population genomics and biomedical research. Recent advancements have ...
https://www.biorxiv.org/content/10.1101/2024.03.16.585351v1
2
18
7
reposted by
Yun S. Song
Michael Skinnider
about 1 year ago
Thrilled to share our approach for language model-guided discovery of unknown mammalian metabolites: DeepMet. We’ve now used this approach to discover ~50 new human and mouse metabolites!
www.biorxiv.org/content/10.1...
Language model-guided anticipation and discovery of unknown metabolites
Despite decades of study, large parts of the mammalian metabolome remain unexplored. Mass spectrometry-based metabolomics routinely detects thousands of small molecule-associated peaks within human ti...
https://www.biorxiv.org/content/10.1101/2024.11.13.623458v1
0
12
4
Large protein language models can learn complex epistatic interactions, but how much does that help with predicting variant effects? In this NeurIPS article, we show that classical independent-sites phylogenetic models can outperform pLMs on this task. 1/7
openreview.net/forum?id=H7m...
Ultrafast classical phylogenetic method beats large protein...
Amino acid substitution rate matrices are fundamental to statistical phylogenetics and evolutionary biology. Estimating them typically requires reconstructed trees for massive amounts of aligned...
https://openreview.net/forum?id=H7mENkYB2J
about 1 year ago
2
90
46