Yunha Hwang
@microyunha.bsky.social
1242
1112
28
Building genomic intelligence @ Tatta Bio
pinned post!
At Tatta Bio, we have been thinking deeply about the sequence-to-function problem. We believe that before AI can power functional prediction, we first need to rethink how we curate, manage, and share sequence data. Here, we share our initial ideas on what we are building next:
Today's sequence data infrastructure is set up for failure in the age of AI.
Building an open and collaborative sequence platform for both Human and AI scientists.
https://tattabio.substack.com/p/todays-sequence-data-infrastructure
4 months ago
1
8
4
reposted by
Yunha Hwang
Axel Visel
about 1 month ago
Ready to explore New Lineages of Life with
@jgi.doe.gov
? 🧬 Registration for our 2025 NeLLi Symposium is now open. For the first time in collaboration with
@unlv.edu
Mark the date: November 6-7 in Las Vegas, NV
1
6
3
reposted by
Yunha Hwang
Florian Trigodet
5 months ago
I am very happy (and anxious) to share with you our most recent work in which we evaluated four of the most popular long-read assemblers,
www.biorxiv.org/content/10.1...
and tell you just a little bit about it in the following 🧵
Assemblies of long-read metagenomes suffer from diverse errors
Genomes from metagenomes have revolutionised our understanding of microbial diversity, ecology, and evolution, propelling advances in basic science, biomedicine, and biotechnology. Assembly algorithms...
https://www.biorxiv.org/content/10.1101/2025.04.22.649783v2
5
130
76
It’s official! I’m thrilled to announce that I will be joining MIT as an assistant professor in a shared appointment between Biology, EECS and Schwarzman College of Computing this fall.
5 months ago
9
66
3
Tatta Bio is growing! We are hiring *two positions* in Business Development and Software Engineering to lead the development of AI-enabled scientific software for open science and biological sequence interpretation. Please check out the job postings at
www.tatta.bio/careers
and share widely!
Job Board | Notion
Overview
https://www.tatta.bio/careers
6 months ago
0
5
2
Can LLM agents discover novel protein functions? Introducing Gaia Agent: an AI biologist capable of reasoning across genomic contexts to predict functions of proteins! Gaia Agent is now integrated with Gaia Search at
gaia.tatta.bio
9 months ago
2
38
14
If you are at
#NeurIPS2024
don't miss
@ancornman1.bsky.social
's talk on OMG/gLM2 at 9AM!
@workshopmlsb.bsky.social
East meeting room 11,12
9 months ago
0
12
3
Excited to be at
#NeurIPS
this week.
@ancornman1.bsky.social
will give a spotlight talk at the
@workshopmlsb.bsky.social
on gLM2/OMG! Please reach out if you want to chat about gLM2/OMG/Gaia and our latest projects
www.biorxiv.org/content/10.1...
The OMG dataset: An Open MetaGenomic corpus for mixed-modality genomic language modeling
Biological language model performance depends heavily on pretraining data quality, diversity, and size. While metagenomic datasets feature enormous biological diversity, their utilization as pretraini...
https://www.biorxiv.org/content/10.1101/2024.08.14.607850v2
10 months ago
0
9
3
reposted by
Yunha Hwang
Mitja M. Zdouc
10 months ago
Are you working on natural products? We’ve just released version 4.0 of the MIBiG data standard and repository! It now includes 3059 biosynthetic gene clusters, thanks to the combined efforts of 288 expert contributors. A thread: (1/8)
academic.oup.com/nar/advance-...
MIBiG 4.0: advancing biosynthetic gene cluster curation through global collaboration
Abstract. Specialized or secondary metabolites are small molecules of biological origin, often showing potent biological activities with applications in ag
https://academic.oup.com/nar/advance-article/doi/10.1093/nar/gkae1115/7919508?searchresult=1
4
92
65
reposted by
Yunha Hwang
Amy Lu
10 months ago
1/🧬 Excited to share PLAID, our new approach for co-generating sequence and all-atom protein structures by sampling from the latent space of ESMFold. This requires only sequences during training, which unlocks more data and annotations:
bit.ly/plaid-proteins
🧵
1
120
40
reposted by
Yunha Hwang
Martin Steinegger 🇺🇦
10 months ago
Our Big Fantastic Virus Database (BFVD) is now published in NAR! It contains protein structure predictions of major viral clades, enhanced by petabase-scale homology search, and it's explorable on the web.
bfvd.foldseek.com
bfvd.steineggerlab.workers.dev
academic.oup.com/nar/advance-...
6
339
132
Hello
#protein
/
#microbio
/
#BioML
community! We are excited to release Gaia, a context-aware protein search tool, extending protein search and discovery capabilities beyond sequence and structure, to include *genomic context*. Search your favorite protein sequences on
gaia.tatta.bio
10 months ago
10
237
83