Andy Pavlo
@andypavlo.bsky.social
📤 5078
📥 56
📝 90
Associate Prof. of Databases @ Carnegie Mellon.
reposted by
Andy Pavlo
CMU Database Group
about 11 hours ago
Today's Postgres vs. World Seminar Speaker: Marek Galovic will present the TopK document + vector search engine system. Zoom talk open to public at 4:30pm ET. YouTube video available after:
db.cs.cmu.edu/events/pg-vs...
TopK: Billion-Scale Hybrid Retrieval from the Ground Up (Marek Galovic) - Carnegie Mellon Database Group
TopK is a search engine built from the ground up for unstructured... Read More +
https://db.cs.cmu.edu/events/pg-vs-world-topk-marek-galovic/
0
4
2
reposted by
Andy Pavlo
CMU Database Group
7 days ago
Today's Postgres vs. World Seminar Speaker: Marc Brooker (
@marcbrooker.bsky.social
) will present the architecture of Amazon's Aurora DSQL Postgres-compatible serverless OLTP DBMS. Zoom talk open to public at 4:30pm ET. YouTube video available after:
db.cs.cmu.edu/events/pg-vs...
Aurora DSQL: Serverless, Scalable, Global OLTP (Marc Brooker) - Carnegie Mellon Database Group
Amazon Aurora DSQL is a distributed SQL database, designed to make it... Read More +
https://db.cs.cmu.edu/events/pg-vs-world-aurora-dsql-marc-brooker/
0
13
3
reposted by
Andy Pavlo
CMU Database Group
14 days ago
Today's Postgres vs. World Seminar Speaker: Tyler Akidau + Adam Symanski will present the architecture of
@redpanda.com
's Oxla database system. Zoom talk open to public at 4:30pm ET. YouTube video available after:
db.cs.cmu.edu/events/pg-vs...
Redpanda Oxla or: Why Your Hashmaps are Secretly Wrecking Your Performance (Tyler Akidau + Adam Symanski) - Carnegie Mellon Database Group
In this talk, we’ll first give an overview of the Oxla analytical... Read More +
https://db.cs.cmu.edu/events/pg-vs-world-redpanda-oxla-tyler-akidau-adam-symanski/
0
5
2
Spring 2026 Seminar Series: PostgreSQL vs. The World
db.cs.cmu.edu/seminars/spr...
First talk on Mon Feb 2nd @ 4:30pm EST. We will alternate between a speaker from either a Postgres DBMS or a non-Postgres DBMS. Open to the public over Zoom. All videos available on YouTube afterwards.
PostgreSQL vs. The World Seminar Series - Carnegie Mellon Database Group
Every major cloud vendor now offers an enhanced, opinionated PostgreSQL-compatible database management... Read More +
https://db.cs.cmu.edu/seminars/spring2026/
17 days ago
1
22
2
Thanks to Bohan Zhang for hosting me at OpenAI yesterday. Lots of
@db.cs.cmu.edu
alum are thriving there. Plus the Rockset squad rolled up. It was the nicest tech office I've visited in my life. It was like a classy law firm but with an insane number of ex-FBI bouncers at the front entrance.
25 days ago
2
15
1
Congratulations to the 2026 CIDR prize awardees!
Tianyu Li → Gong Show Winner
Fuheng Zhao → Database Quiz Winner
They each received a rare signed print of "The Birth of the Database Messiah" (estimated insurance value $12,000).
26 days ago
0
23
0
I recently came across this database system in my travels:
medium.com/@sschepis/i-...
The title immediately raises my BS alarms. They claim to "teleport data" via "quantum mechanical principles".
about 1 month ago
1
12
2
I don't want to cook such an early stage company but I think these people are trying to sell MMAP as a service?!? No technical details except it appears to be a MMAP buffer pool. I also don't know why their system is "fully ACID" but RocksDB is not?
ryjoxdemo.com/solutions/ed...
about 1 month ago
2
11
0
At least this scam company remembered to sanitize their database inputs before sending out their spam...
about 1 month ago
0
28
0
I've posted my latest recap of the world of databases:
www.cs.cmu.edu/~pavlo/blog/...
All the hot topics from the last year:
• More Postgres action!
• MCP for everyone!
• MongoDB gets litigious with FerretDB!
• File formats!
• Market movements!
• The richest person in the history of the world!
Databases in 2025: A Year in Review
The world tried to kill Andy off but he had to stay alive to talk about what happened with databases in 2025.
https://www.cs.cmu.edu/~pavlo/blog/2026/01/2025-databases-retrospective.html
about 1 month ago
1
77
32
Congratulations to the #1 ranked
@db.cs.cmu.edu
PhD student Wan Shen Lim (
@wslim.bsky.social
) for successfully passing his doctoral defense. Wan has been working on hard AF database research with me for the last *nine* years at CMU (undergrad+grad). He also hates chickens.
2 months ago
3
30
1
reposted by
Andy Pavlo
David Andersen
2 months ago
in a tiny job update: I'm taking over as co-director of CMU's parallel data lab (PDL). in a bad news update: I just used the phrase "align with CMU's brand strategy" unironically in an email to the administration. might need an intervention...
2
18
1
reposted by
Andy Pavlo
CMU Database Group
2 months ago
Today's Future Data Systems Seminar Speaker: Jark Wu from AlibabaCloud will present an overview of Apache Fluss. Zoom talk open to public at 4:30pm ET. YouTube video available after:
db.cs.cmu.edu/events/futur...
[Future Data] Apache Fluss: A Streaming Storage for Real-Time Lakehouse - Carnegie Mellon Database Group
Modern data lakehouses promise unified batch and streaming processing, yet their storage... Read More +
https://db.cs.cmu.edu/events/future-data-apache-fluss-a-streaming-storage-for-real-time-lakehouse/
0
7
2
Do you like databases? Do you want to hear two database professors rant about them? Do you need one of those professors to have a Turing Award for databases? If yes, then join Mike Stonebraker and me next Wed Dec 10 @ 1:00pm EST for database hot takes:
www.dbos.dev/webcast-2025...
2025 in Review with Mike Stonebraker and Andy Pavlo
Webcast Dec 10: DBMS researchers Mike Stonebraker (MIT / DBOS) and Andy Pavlo (CMU) discuss which data and CS trends are heating up or cooling down heading into 2026.
https://www.dbos.dev/webcast-2025-in-review-with-mike-stonebraker-and-andy-pavlo
2 months ago
3
75
20
reposted by
Andy Pavlo
Conor Power
3 months ago
There is still time to register for CIDR 2026 in Santa Cruz! If you need a roommate for the conference, there is also a spreadsheet you can use to find someone!
www.cidrdb.org/cidr2026/reg...
CIDR 2026 - Registration
The 16th Conference on Innovative Data Systems Research (CIDR 2026), Registration Information
https://www.cidrdb.org/cidr2026/registration.html
0
7
5
reposted by
Andy Pavlo
CMU Database Group
3 months ago
Today's Future Data Systems Seminar Speaker: Prashant Singh from Snowflake will present the Apache Polaris re-implementation of the Iceberg REST catalog API. Zoom talk open to public at 4:30pm ET. YouTube video available after:
db.cs.cmu.edu/events/futur...
[Future Data] From Storage Formats to Open Governance: The Evolution to Apache Polaris - Carnegie Mellon Database Group
As organizations build their data lakehouses on Apache Iceberg, the primary challenge... Read More +
https://db.cs.cmu.edu/events/futuredata-apache-polaris/
0
5
2
reposted by
Andy Pavlo
CMU Database Group
3 months ago
Today's Future Data Systems Seminar Speaker: Jeremy Taylor (
@refset.bsky.social
) will present the architecture of the XTDB (
@xtdb.com
) time-traveling database system. Zoom talk open to public at 4:30pm ET. YouTube video available after:
db.cs.cmu.edu/events/futur...
[Future Data] Reconstructing History with XTDB - Carnegie Mellon Database Group
XTDB is a SQL database that challenges long held assumptions about how... Read More +
https://db.cs.cmu.edu/events/futuredata-reconstructing-history-with-xtdb/
0
4
4
reposted by
Andy Pavlo
CMU Database Group
3 months ago
Today's Future Data Systems Seminar Speaker: Benjamin Wagner🇩🇪 will present
@firebolthq.bsky.social
's native support for low-latency queries on Apache Iceberg tables. Zoom talk open to public at 4:30pm ET. YouTube video available after:
db.cs.cmu.edu/events/futur...
[Future Data] Why Powering User Facing Applications on Iceberg is Hard - Carnegie Mellon Database Group
Firebolt is a Postgres compliant analytical database built for low-latency, high-concurrency analytics.... Read More +
https://db.cs.cmu.edu/events/future-data-firebolt/
0
13
3
reposted by
Andy Pavlo
CMU Database Group
3 months ago
Today's Future Data Systems Seminar Speaker: Cheng Chen will present how
@mooncakelabs.bsky.social
extends PostgreSQL to support Apache Iceberg. Zoom talk open to public at 4:30pm ET. YouTube video available after:
db.cs.cmu.edu/events/futur...
[Future Data] Mooncake: Real-Time Apache Iceberg Without Compromise - Carnegie Mellon Database Group
Apache Iceberg is great for large-scale analytics, but it was built for... Read More +
https://db.cs.cmu.edu/events/futuredata-mooncake/
0
5
2
reposted by
Andy Pavlo
Sam Arch
4 months ago
Great idea to compare plans across different systems using rows processed. A good yardstick, but slower sort-based plans from Postgres + MSSQL process fewer rows than faster hash-based plans from DuckDB. Postgres rows scanned also seem underreported. Nice to see some competition with ClickBench.
0
3
2
New database leaderboard from Yellowbrick ranks the quality of DBMS optimizer estimates and plans. They only evaluate TPC-H for now and report results for Postgres + DuckDB + MSSQL:
sql-arena.com/components/p...
Repo:
github.com/sql-arena/db...
LinkedIn Group:
www.linkedin.com/groups/15775...
4 months ago
1
14
3
reposted by
Andy Pavlo
CMU Database Group
4 months ago
Today's Future Data Systems Seminar Speaker: Ryan Johnson (CMU PhD'10) will present
@deltalakeoss.bsky.social
's internal architecture and how it supports multi-statement transactions. Zoom talk open to public at 4:30pm ET. YouTube video available after:
db.cs.cmu.edu/events/futur...
[Future Data] Multi-statement Transactions in the Databricks Lakehouse - Carnegie Mellon Database Group
The data lake architecture originally focused on self-standing tables in cloud storage,... Read More +
https://db.cs.cmu.edu/events/futuredata-deltalake/
0
4
2
reposted by
Andy Pavlo
CMU Database Group
4 months ago
Today's Future Data Systems Seminar Speaker: Joyo Victor will present
@singlestore.com
's "Bottle Service" meta-data system that supports database branching, change-data-capture, and Apache Iceberg. Zoom talk open to public at 4:30pm ET. YouTube video available after:
db.cs.cmu.edu/events/futur...
[Future Data] Storage Metadata for Modern Cloud Databases - Carnegie Mellon Database Group
In modern database architecture, separating compute from storage unlocks powerful capabilities. Our... Read More +
https://db.cs.cmu.edu/events/futuredata-singlestore
0
2
3
Lots of database action this week. Yes, I have a new start-up
@sydht.ai
with my PhD students
@wslim.bsky.social
+
@17zhangw.bsky.social
using LLMs to optimize almost everything in PostgreSQL.
@datadictum.bsky.social
posted a new article on our approach:
www.theregister.com/2025/10/22/c...
Researchers tout vector-based automated tuning in PostgreSQL
: Researchers say 'Proto-X' fine-tunes databases automatically, delivering multifold performance boosts
https://www.theregister.com/2025/10/22/cmu_proto_x_postgres/
4 months ago
2
16
2
reposted by
Andy Pavlo
ScyllaDB
4 months ago
Day 2 of
#P99CONF
is here! The
#ScyllaDB
Lounge opens at 8:00 am PST, and then we get things started with keynotes from
@dorlaor.bsky.social
and
@andypavlo.bsky.social
. Don't forget that all registrants receive Instant Access to the sessions once the conference ends.
www.p99conf.io?latest_sfdc_...
0
3
2
reposted by
Andy Pavlo
CMU Database Group
4 months ago
Today's Future Data Systems Seminar Speaker: Ian Cook (
@ian.columnar.tech
) will present
@columnar.tech
's work on Apache Arrow's database connectivity API (ADBC). ADBC is available in modern DBMSs. Zoom talk open to public at 4:30pm ET. YouTube video available after:
db.cs.cmu.edu/events/futur...
[Future Data] Where We're Going, We Don't Need Rows: Columnar Data Connectivity with ADBC - Carnegie Mellon Database Group
ADBC (Arrow Database Connectivity) is Apache Arrow’s answer to ODBC and JDBC:... Read More +
https://db.cs.cmu.edu/events/futuredata-where-were-going-we-dont-need-rows-columnar-data-connectivity-with-adbc/
0
15
9
reposted by
Andy Pavlo
CMU Database Group
4 months ago
Today's Future Data Systems Seminar Speaker: Will Manning (
@willmanning.com
) will present
@spiraldb.com
's Vortex file format. Vortex is now a
@linuxfoundation.org
project. Zoom talk open to public at 4:30pm ET. YouTube video available after:
db.cs.cmu.edu/events/futur...
[Future Data] Vortex: LLVM for File Formats - Carnegie Mellon Database Group
Apache Parquet revolutionized columnar storage after its initial release in 2013, but... Read More +
https://db.cs.cmu.edu/events/futuredata-vortex/
0
4
4
reposted by
Andy Pavlo
Andrew Lamb
4 months ago
BTW if anyone wants a good intro to database storage / Log structured storage (aka LSM trees)
@db.cs.cmu.edu
lecture this fall is a good one:
www.youtube.com/watch?v=2_sT...
#05 - Log-Structured Database Storage ✸ SingleStore Database Talk (CMU Intro to Database Systems)
YouTube video by CMU Database Group
https://www.youtube.com/watch?v=2_sTdS4h-bY
0
17
4
reposted by
Andy Pavlo
Artem Krylysov
4 months ago
MMAP is incredibly fast when the dataset fits in memory, but it slows to a crawl when it doesn't, especially if the workload is mostly random point lookups. Speaking as someone who built an MMAP-based key-value store before :) Obligatory paper from
@andypavlo.bsky.social
db.cs.cmu.edu/mmap-cidr2022/
0
9
2
reposted by
Andy Pavlo
CMU Database Group
4 months ago
Today's Future Data Systems Seminar Speaker: Jordan Tigani (
@jrdntgn.bsky.social
) will present how
@motherduck.com
supports modern workloads with DuckLake. Zoom talk open to public at 4:30pm ET. YouTube video available after:
db.cs.cmu.edu/events/futur...
[Future Data] DuckLake: Learning from Cloud Data Warehouses to Build a Robust "Lakehouse" - Carnegie Mellon Database Group
When building scalable data systems, it is easy to focus on the... Read More +
https://db.cs.cmu.edu/events/future-data-ducklake-learning-from-cloud-data-warehouses-to-build-a-robust-lakehouse/
0
13
6
Our SIGMOD paper with our friends at Tsinghua +
@wesmckinney.com
+
@pateljm.bsky.social
on creating a next-generation open-source data file format is out. F3 is a future-proof file format that avoids the mistakes of Parquet. 📄 Paper:
db.cs.cmu.edu/papers/2025/...
📁 Code:
github.com/future-file-...
5 months ago
4
70
26
reposted by
Andy Pavlo
CMU Database Group
5 months ago
Today's Future Data Systems Seminar Speaker: Vinoth Chandar will present the internals of Apache Hudi and his work at Onehouse. Zoom talk open to public at 4:30pm ET. YouTube video available after:
db.cs.cmu.edu/events/futur...
[Future Data] Apache Hudi: A Database Layer over Cloud Storage for Fast Mutations and Efficient Queries - Carnegie Mellon Database Group
Data lakes emerged as a way to store vast amounts of data... Read More +
https://db.cs.cmu.edu/events/futuredata-apache-hudi/
0
4
1
reposted by
Andy Pavlo
CMU Database Group
5 months ago
Today's Future Data Systems Seminar Speaker: Russell Spitzer will present the internals of Apache Iceberg's query planner and execution engine. Zoom talk open to public at 4:30pm ET. YouTube video available after:
db.cs.cmu.edu/events/futur...
[Future Data] An Extremely Technical Overview of how the Apache Iceberg™ Planning Implementation Actually Works - Carnegie Mellon Database Group
What are you trying to tell me? That I can read data... Read More +
https://db.cs.cmu.edu/events/futuredata-apache-iceberg/
0
8
5
Next week is the start of
@db.cs.cmu.edu
's latest seminar series: Future Data Systems.
@samarchdb.bsky.social
and I are hosting speakers from leading systems in the datalake / lakehouse space. Mondays @ 4:30pm ET via Zoom. Open to the public. Videos posted to YouTube:
db.cs.cmu.edu/seminars/fal...
5 months ago
1
41
14
I don't know what to say. You dream about it for so long and then when it finally happens you're in shock. I'm so proud of you Larry.
www.theguardian.com/technology/2...
Larry Ellison overtakes Elon Musk as world’s richest person
Oracle co-founder’s shares rose by 40% in early trading, valuing his fortune at $393bn, just ahead of Musk’s $384bn
https://www.theguardian.com/technology/2025/sep/10/larry-ellison-dislodges-elon-musk-as-worlds-richest-person
5 months ago
0
17
3
reposted by
Andy Pavlo
CedarDB
5 months ago
What if a database could be your game engine? During parental leave
@lukasvogel.bsky.social
built DOOMQL: A multiplayer DOOM-like where everything (rendering, game loop, state) runs in pure SQL on CedarDB. It's fast, ridiculous, and surprisingly elegant. Full write-up:
cedardb.com/blog/doomql
1
16
6
Today is the new semester for
@db.cs.cmu.edu
's Intro to Database Systems! We're going harder into material than before. More challenging projects but you can use LLMs to help. We also have 10min talks each Wed from leading DB companies:
15445.courses.cs.cmu.edu/fall2025
CMU 15-445/645 :: Intro to Database Systems (Fall 2025)
You want to know whether this is the premier course at Carnegie Mellon University on the design and implementation of database management systems? Well, it is. This course rips through data models (re...
https://15445.courses.cs.cmu.edu/fall2025
6 months ago
1
63
18
reposted by
Andy Pavlo
Jonathan Aldrich
6 months ago
Launching my Programming Language Pragmatics talks! These short, accessible talks cover the material in the textbook, the 5th edition of which I wrote with Michael L. Scott. The first one introduces the topic and talks about why we study programming languages!
www.youtube.com/watch?v=hwL0...
PLP 1.1: Introduction to Programming Languages
YouTube video by Jonathan Aldrich
https://www.youtube.com/watch?v=hwL0VvOs3xU
3
23
7
The report of my death was an exaggeration. I am still alive and will be in SFO this week to speak about using LLMs to automatically tune databases. Wed Aug 6th @ 5:30pm at Databricks MTV:
lu.ma/ha0dc4nj
7 months ago
0
21
2
reposted by
Andy Pavlo
Alex Miller
7 months ago
Attention, South Bay folk! We have The Databaseologist, @andypavlo.bsky.social, giving a talk in the bay on August 6th. Come join us for a great time in hearing: ChatGPT Ain’t Got $%@& On Me! The Future of Automated Database Tuning Register now!
https://lu.ma/ha0dc4nj
South Bay Systems: ChatGPT Ain’t Got $%@& On Me! The Future of Automated Database Tuning · Luma
We're excited to feature Andy Pavlo, illustrious database professor at CMU, to talk about database tuning. This meetup's venue, food and drinks, are generously…
https://lu.ma/ha0dc4nj
0
13
4
At last
@abigalekim.bsky.social
's paper is out! It's the most complete eval of DB extensions/plugins ever. We analyze PostgreSQL, MySQL, MariaDB, SQLite, DuckDB, Redis. TLDR: Postgres extns ecosystem is fraught with footguns. Other DBMSs have fewer extns but fewer problems. DuckDB has the cleanest API.
8 months ago
1
67
14
People asked for the rest of the lecture videos for CMU-DB's optimizer course (
15799.courses.cs.cmu.edu/spring2025
). Unfortunately I got super sick and was in the hospital for 4 weeks. Thankfully
@wslim.bsky.social
+ Jignesh taught the remaining lectures, but we didn't record those classes.
CMU 15-799 :: Special Topics in Databases: Query Optimization (Spring 2025)
This course is a hands-on exploration of the most challenging problem in computer science: database query optimization. It will cover the classical and state-of-the-art methods and algorithms for conv...
https://15799.courses.cs.cmu.edu/spring2025/
8 months ago
2
22
2
Shots fired by
@firebolthq.bsky.social
with their new on-prem executable (
www.firebolt.io/blog/introdu...
). They have dethroned the Umbra system by The Germans™ at @tum.de in the ClickBench rankings:
benchmark.clickhouse.com
8 months ago
3
17
3
reposted by
Andy Pavlo
ewuuu
10 months ago
Almost every DB faculty member who has received an NSF award has benefited from Hector and Sylvia’s work. Well deserved!
0
4
1
reposted by
Andy Pavlo
CMU Database Group
10 months ago
Today's SQL or Death Seminar Speaker: Michael Sullivan (PhD'17) will present
@geldata.com
's "graph-relational" data model and query language to replace SQL. Zoom talk open to public at 4:30pm ET. YouTube video available after:
db.cs.cmu.edu/events/sql-d...
[SQL Death] Gel: Replacing* SQL and Improving on the Relational Database Model - Carnegie Mellon Database Group
Gel (formerly EdgeDB) is a new database built around an evolution of... Read More +
https://db.cs.cmu.edu/events/sql-death-gel-replacing-sql-and-improving-on-the-relational-database-model/
0
8
2
reposted by
Andy Pavlo
Andrew Lamb
10 months ago
Variant is coming soon to @ApacheParquet in Rust. Huge thanks to
@db.cs.cmu.edu
for getting the process started in @ApacheArrow with a great draft PR to kick off Variant support:
github.com/apache/arrow...
🙏🙏🙏 Thank you
0
22
4
reposted by
Andy Pavlo
Alex Miller
10 months ago
New blog post on the mental model I've used when working through complex or confusing papers on transactional systems.
transactional.blog/b...
1
42
15
After trying for several years, we finally have Monty Widenius giving a
@db.cs.cmu.edu
talk with us today! He will discuss the complete rewrite of
@mariadb.bsky.social
's query optimizer after forking from MySQL 15 years ago. You definitely should join us at 4:30pm ET over Zoom.
10 months ago
1
22
4
Today's talk is a follow-up to
@bcantrill.bsky.social
+
@ahl.bsky.social
's October 2024 podcast about OxQL in response to my complaint about yet another query language [skip to 4:18]
youtu.be/RTsXM3kcAaI?...
Original tweet:
twitter.com/andy_pavlo/s...
11 months ago
0
23
8
reposted by
Andy Pavlo
CMU Database Group
11 months ago
Today's SQL or Death Seminar Speaker: Ben Naecker from
@oxide.computer
will explain why a hardware company decided to make a new query language (OxQL) because SQL wasn't good enough! Zoom talk open to public at 4:30pm ET. YouTube video available after:
db.cs.cmu.edu/events/sql-d...
[SQL Death] OxQL: Oximeter Query Language - Carnegie Mellon Database Group
Oxide Computer Company builds private cloud computers–co-designing hardware and software that works... Read More +
https://db.cs.cmu.edu/events/sql-death-oxql-oximeter-query-language/
0
23
7