Alex Miller
@alexmillerdb.bsky.social
👤 1953
👥 129
📝 373
Database Papers as a Service
reposted by
Alex Miller
South Bay Systems
about 1 hour ago
South Bay Systems returns for its June meetup on June 26th! We'll have two speakers: Yao Yue will talk about her experiences working on caching at Twitter and developing Pelikan, and Eric Liang will talk about Databricks' history-based data clustering feature. Register at
luma.com/ogvfeacv
South Bay Systems: Caching & Clustering · Luma
Welcome to another edition of South Bay Systems! This time we bring you two wonderful talks: Yao Yue will speak about her experiences working on Caching…
https://luma.com/ogvfeacv
0
2
2
@jpountz.bsky.social
from your thoughts on ACORN-1, I think you’d really like NaviX if you haven’t seen it. Kuzu was doing a lot of cool stuff before acquisition 😢
www.vldb.org/pvldb/vol18/...
https://www.vldb.org/pvldb/vol18/p4438-sehgal.pdf
4 days ago
1
4
1
reposted by
Alex Miller
South Bay Systems
13 days ago
The recording from this event is now available!
youtu.be/9LiSWbRASKc
0
4
2
It appears there’s no SIGMOD programming contest for 2026? A 16-year streak of cool projects has ended. 😢
19 days ago
2
6
0
Achievement unlocked (From “How to write to SSDs”
arxiv.org/pdf/2603.09927
)
28 days ago
5
31
1
reposted by
Alex Miller
Qian Li
28 days ago
Our next
@southbaysystems.xyz
meetup is on May 26! We're covering one of my favorite topics: databases, and how to use them to make better architectural decisions and build reliable systems. Food and drinks will be provided courtesy of our hosts at PingCAP. Register here:
luma.com/3j9twotu?tk=...
South Bay Systems: Queues & RDBMS Extensibility · Luma
Welcome to another edition of South Bay Systems! This time we bring you two wonderful talks: Himank Chaudhary will explain how the queuing infrastructure in…
https://luma.com/3j9twotu?tk=FDqNGL
1
9
6
reposted by
Alex Miller
South Bay Systems
about 2 months ago
South Bay Systems returns for its April meetup on the 30th. This time we have
@cliffclick.bsky.social
giving a walkthrough of his teaching language for Sea of Nodes! Sign up now!
luma.com/nnq9aq27
South Bay Systems: A Simple Guide to Sea of Nodes · Luma
Welcome to another edition of South Bay Systems! This time we have our first compilers talk: a guided tour of Simple, a teaching language meant to showcase the…
https://luma.com/nnq9aq27
0
2
3
reposted by
Alex Miller
South Bay Systems
2 months ago
The recording from the last talk is up!
youtu.be/TeFsBVIYBis
Generalized Consensus & Native Top-K Joins in ParadeDB
YouTube video by South Bay Systems
https://youtu.be/TeFsBVIYBis
0
1
2
If you're an RSS user and a South Bay Systems attendee, I've added an RSS feed for the events at
southbaysystems.xyz/...
3 months ago
0
3
2
reposted by
Alex Miller
Murat (Distributolog)
3 months ago
I built a small interactive visualizer for Hybrid Logical Clocks (HLC). I used Claude Code to put this together quickly and make the behavior visible step by step. Try it here:
muratdem.github.io/hlc-visualiz...
Feedback welcome. Share if useful.
Hybrid Logical Clocks
Here I will write about our recent work on Hybrid Logical Clocks, which provides a feasible alternative to Google's TrueTime. A brief hist...
https://muratbuffalo.blogspot.com/2014/07/hybrid-logical-clocks.html
1
18
4
reposted by
Alex Miller
Qian Li
3 months ago
Our next South Bay Systems meetup will be on March 31. We've got two awesome deep-dive talks from Sugu Sougoumarane (Consensus and Multigres) and
@stuhood.sh
(Full-Text Search and ParadeDB). Food and beverages will be provided, courtesy of our host, Snowflake. Register here:
luma.com/2g3exvjw
South Bay Systems: Consensus & Full Text Search · Luma
Welcome to another edition of South Bay Systems! This time we bring you two wonderful talks: Sugu Sougoumarane will be speaking about deconstructing consensus…
https://luma.com/2g3exvjw
0
6
6
[CIDR '25] Linear Elastic Caching via Ski Rental
www.vldb.org/cidrdb/...
You should consider that holding a page in cache costs you, because RAM itself is expensive, and existing page replacement algorithms look at sizing cache independently (via miss-ratio curves).
3 months ago
1
7
0
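The idea in the post above, that an idle page keeps costing RAM "rent" until it is cheaper to take a miss, can be illustrated with the classic deterministic ski-rental rule. This is a minimal sketch of that textbook rule only, not the paper's algorithm; the function names and the cost units are hypothetical.

```python
def break_even_ttl(miss_cost: float, hold_cost_per_sec: float) -> float:
    """Seconds an idle page can stay cached before the accumulated
    holding cost ("rent") equals the cost of one miss ("buy")."""
    if miss_cost < 0 or hold_cost_per_sec <= 0:
        raise ValueError("miss_cost must be >= 0 and hold_cost_per_sec > 0")
    return miss_cost / hold_cost_per_sec


def should_evict(idle_seconds: float, miss_cost: float,
                 hold_cost_per_sec: float) -> bool:
    """Classic ski-rental rule: keep paying rent until the rent paid
    reaches the buy price, then stop renting (evict the page)."""
    return idle_seconds >= break_even_ttl(miss_cost, hold_cost_per_sec)


if __name__ == "__main__":
    # Hypothetical numbers: a miss costs 10 units, holding costs 2 units/sec.
    print(break_even_ttl(10.0, 2.0))     # break-even idle time in seconds
    print(should_evict(4.0, 10.0, 2.0))  # still cheaper to keep the page
    print(should_evict(6.0, 10.0, 2.0))  # rent has exceeded one miss
```

With this rule the total cost paid is never more than twice what an optimal policy with perfect knowledge of the next access would pay, which is why it is a common baseline.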
[arXiv] Dynamic read & write optimization with TurtleKV
arxiv.org/pdf/2509.1...
TurtleKV shows a way to elastically move around the RUM conjecture space depending on what is important at the moment.
3 months ago
1
9
0
reposted by
Alex Miller
PVLDB
3 months ago
Vol:19 No:3 → Tux: Efficient Drop-in Networking for Database Systems 👥 Authors: Xinjing Zhou, Viktor Leis, Xiangyao Yu, Michael Stonebraker 📄 PDF:
https://www.vldb.org/pvldb/vol19/p334-zhou.pdf
0
4
2
[VLDB '26] Garnet: A Next-Generation Cache-Store for Accelerating Applications and Services
www.vldb.org/pvldb/v...
It's fast, durable Redis, brought to you by Badrish Chandramouli (et al.), known for other 🔥 work like FASTER and Bf-tree.
3 months ago
0
8
0
[CIDR '25] Adaptive Factorization Using Linear-Chained Hash Tables
vldb.org/cidrdb/pape...
Adaptive execution + factorization + WCOJ = great paper. The best intro to factorized databases I know of is
www.youtube.com/watc...
.
3 months ago
0
5
0
[VLDB '25] MD-MVCC: Multi-version Concurrency Control for Schema Changes in Azure SQL Database
www.vldb.org/pvldb/v...
A great discussion of the end-to-end impact of allowing multiple versions of schema metadata information to be live concurrently, in a real, production system.
3 months ago
0
8
2
https://scour.ing/
has gotten pretty good at surfacing what new stuff I actually want to read on the internet, better than following subreddits. You can see my feed of mostly database things at
scour.ing/@linearizable
. It surfaces small personal blogs particularly well.
Scour
Scour interesting reads from noisy feeds you can't keep up with and smaller sites you didn't know to check.
https://scour.ing/
3 months ago
1
22
3
The recording finally went great this time. I also demo'd doing a backup audio recording so that we can more reliably get a good recording to post, so hopefully the trend will continue 🤞
3 months ago
1
8
1
Does anyone know of a good webapp or Discord bot or something to help manage a reading group? Something that keeps a list of suggested things to read, can do voting on the next thing to read, and maybe has a bit of curation support for when the to-read list gets unmanageable?
4 months ago
1
5
0
reposted by
Alex Miller
South Bay Systems
5 months ago
Our next event will be on January 21st, featuring speakers from (the just-finishing) CIDR! Come to Databricks to hear about: * DuckDB on xNVMe by
@pinartozun.bsky.social
of ITU * Spilling in QP by Maximilian Kuschewski of TUM * NPUs in DBs by Alexander Baumstark of TU-Ilmenau
luma.com/8a54z94d
South Bay Systems: Innovative Data Systems Research · Luma
Welcome to another edition of South Bay Systems! This time we bring you three wonderful talks from authors at the just-finishing Conference in Innovative Data…
https://luma.com/8a54z94d
2
12
6
I’ve recently seen multiple, unrelated instances of people referencing Bf-trees. Good job,
@benjdd.com
.
5 months ago
1
9
2
Asking a coding agent to run `cargo build` and read referenced source files for context has made LLMs significantly more helpful and accurate at actually understanding why a compilation error is happening and being able to explain an appropriate fix. Much better than copy-pasting into online LLMs.
5 months ago
1
3
0
It’s frustrating how a bunch of database research from the 1980s and before basically doesn’t exist anymore because it’s not on the internet, and it’s not even in the IEEE/ACM indexes of published work.
7 months ago
0
6
1
Does anyone have links to good writing on the sort of soft skills you learn from working in larger organizations about how to work in larger organizations as an IC? The overall space of soft skills dealing with the pretty common ways that large corporations behave.
7 months ago
2
9
3
Looking forward to reading about which disaggregated architecture HorizonDB aligned itself with
7 months ago
2
4
0
So what’s the feature set difference between pgDog, Neki, and Multigres?
7 months ago
1
4
0
reposted by
Alex Miller
South Bay Systems
7 months ago
Our next event is on November 19th at StarTree’s office in downtown Mountain View. Come hear about Morel from Julian Hyde and Query Optimization as a Service from Yuanyuan Tian!
luma.com/xygolo9c
South Bay Systems: Morel / Query Optimization as a Service · Luma
Welcome to another edition of South Bay Systems! This time we bring you two wonderful talks: Julian Hyde will be speaking about Morel, a new functional…
https://luma.com/xygolo9c
0
1
1
reposted by
Alex Miller
South Bay Systems
7 months ago
The recording from our last South Bay Systems meetup is now available!
youtu.be/f1bz3efUJpM
Apache Pinot on Object Storage & JSON in Apache Doris
YouTube video by South Bay Systems
https://youtu.be/f1bz3efUJpM
1
5
5
reposted by
Alex Miller
Tyler Hillery
8 months ago
@abigalekim.bsky.social
@xiangpeng.systems
and I are kicking off Madison Systems with a coffee chat on Sunday, Nov 9th. Come nerd out on systems!
luma.com/v69tvpla
Madison Systems Coffee Chat · Luma
If you’re working on or are interested in anything in the space of software internals (compilers, databases, operating systems, etc.), come grab a cup of…
https://luma.com/v69tvpla
1
10
6
[PVLDB] Enhancing Transaction Processing through Indirection Skipping
www.vldb.org/pvldb/v...
Whereas VMCache reduces the complexity of pointer swizzling by removing the swizzling, this work points out that page and frame hints are highly effective, and okay if they're wrong.
8 months ago
0
2
1
This reminded me I've been sitting on draft blog posts about Copy-and-Patch JIT compilation for a while, and so I've finally published the first chunk of it: a minimal tutorial and explanation of how and why Copy-and-Patch actually works. Start at
transactional.blog/copy-and-pat...
8 months ago
0
10
3
reposted by
Alex Miller
South Bay Systems
8 months ago
South Bay Systems returns on October 27th at Adobe in downtown San Jose. We have an Analytics-on-Object-Storage double feature this time starring two different Apache projects: Apache Pinot and Apache Doris. (Talk descriptions below.) Register now!
luma.com/9o6bahgc
South Bay Systems: Apache Pinot on Object Storage / Variants in Apache Doris · Luma
Welcome to another edition of South Bay Systems! This time, we'll have a double feature! First we'll have Songqiao Su and Raghav Yadav talking about…
https://luma.com/9o6bahgc
1
6
4
reposted by
Alex Miller
South Bay Systems
8 months ago
There was an accident with the recording where audio wasn't captured, so instead we can offer a recording from one of Jakob's practice runs on Twitch:
www.twitch.tv/videos/25845...
1
7
3
reposted by
Alex Miller
Qian Li
8 months ago
Had a fun time at the South Bay Systems meetup last night. Thanks
@yugabytedb.bsky.social
for hosting!
@codedrift.social
gave a great talk on WebAssembly: what it is (and isn't), how it connects to WASI, and promising projects. He cuts through a lot of the hype vs. reality. Recording coming soon.
1
25
9
[ASPLOS'25] Fusion: An Analytics Object Store Optimized for Query Pushdown
www.cs.princeton.edu...
Tightly integrating an Iceberg catalog with an object store means that one could make file-format aware erasure coding decisions, to permit pushing down filters and aggregations.
8 months ago
0
14
4
[VLDB] Towards Principled, Practical Document Database Design
www.vldb.org/pvldb/v...
If you've ever wished that there was a document database equivalent for relational databases' 3NF-style schema design guidance, then this is the paper for you.
9 months ago
1
9
0
[arXiv] On the Theoretical Limitations of Embedding-Based Retrieval
arxiv.org/abs/2508.2...
It's impossible to retrieve all combinations of pairs of documents post-embedding. Thus, there are use cases that vector search won't do well at. Conversely, BM25 excels in these cases.
9 months ago
0
11
2
I text-to-speech papers often, and
www.paper2audio.com
finally did the one thing that I was hoping AI would enable: replace tables/figures/diagrams with a summary of what is being shown. It makes table/diagram-heavy papers actually comprehensible. There are iOS and Android apps, and it's free.
9 months ago
4
10
2
[VLDB] NaviX: A Native Vector Index Design for Graph DBMSs With Robust Predicate-Agnostic Search Performance
www.vldb.org/pvldb/v...
It feels like a follow-on/improvement to ACORN. Also interesting to see HNSW built directly on a graph database working well.
9 months ago
0
3
0
Someone should go implement a bulk loading into btree mechanism relying on
man7.org/linux/man-pa...
to be able to prepare a tree of data, and then just atomically drop it into the main btree file as a sub-tree, as that'd be pretty cool to read about.
10 months ago
1
4
0
There’s surprisingly been no good citation for follower reads and the trade-offs therein. Super excited that this finally got published.
law-theorem.com
had “Coming soon!” for a few years
10 months ago
0
22
2
For anyone else trying to catch up on DBSP, my recommended flow of learning is: 1. Watch the talk:
www.youtube.com/watch?v=omOH...
(h/t
@wslim.bsky.social
) 2. Read the spec/book:
mihaibudiu.github.io/work/dbsp-sp...
(h/t
@avi.im
) 3. Read the VLDB paper. The list is ordered by assumed knowledge of the reader.
10 months ago
1
25
4
In the Postgres-style MVCC vs MySQL-style MVCC debates, I'd really love to see an implementation of time-separated btrees (
dl.acm.org/doi/pdf/10.1...
) evaluated. It's CoW-BTree style "your path down the tree prunes out versions you don't want to see", but update-in-place and copies only on splits.
10 months ago
1
15
5
reposted by
Alex Miller
Marc Brooker
10 months ago
People often ask me about the differences in architecture between Amazon Dynamo (the 2007 SOSP paper), DynamoDB (the AWS serverless NoSQL database), and Aurora DSQL (the AWS serverless SQL database). I memoized the response on my blog.
brooker.co.za/blog/2025/08...
Dynamo, DynamoDB, and Aurora DSQL - Marc's Blog
https://brooker.co.za/blog/2025/08/15/dynamo-dynamodb-dsql.html
3
51
10
[arXiv] Theseus: A Distributed and Scalable GPU-Accelerated Query Processing Platform Optimized for Efficient Data Movement
arxiv.org/pdf/2508.0...
Great to see the Voltron Data folks writing about their GPU database!
10 months ago
0
7
0
reposted by
Alex Miller
Justin
10 months ago
Today’s NULL BITMAP is a very special one: I have been doing NULL BITMAP every week for two years, and to celebrate, this week I got a collection of friends to put together a printable zine of articles. I hope you enjoy it!
buttondown.com/jaffray/arch...
2
18
6
To randomly sample a number of operations, one pulls from a PRNG.
github.com/buildup-d...
instead shows a cute trick for defining a stateless PRNG: pull RDTSC, run it through a quick hash to scramble the bits (e.g. rapidhash). Cache-miss-free, but you lose determinism in tests.
mysql-server-RP/include/my_rnd.h at 9d88f21761ab9ffe34a4b5831c97e87edfb9c53a · buildup-db/mysql-server-RP
MySQL RP (Restore Performance) is modified version of MySQL Community, to restore performance equal to or better than previous major versions. - buildup-db/mysql-server-RP
https://github.com/buildup-db/mysql-server-RP/blob/9d88f21761ab9ffe34a4b5831c97e87edfb9c53a/include/my_rnd.h#L67
10 months ago
1
0
0
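The stateless-PRNG trick from the post above (read a cycle counter, scramble it with a quick hash) can be sketched roughly as follows. This is an illustration under stated assumptions, not the linked MySQL code: Python has no direct RDTSC access, so `time.perf_counter_ns()` stands in for it, and the splitmix64 finalizer stands in for rapidhash. The names `mix64`, `stateless_random`, and `should_sample` are hypothetical.

```python
import time

MASK64 = (1 << 64) - 1


def mix64(x: int) -> int:
    """splitmix64 finalizer, used here as the 'quick hash' that scrambles
    the bits (an assumption; the post suggests e.g. rapidhash)."""
    x &= MASK64
    x ^= x >> 30
    x = (x * 0xBF58476D1CE4E5B9) & MASK64
    x ^= x >> 27
    x = (x * 0x94D049BB133111EB) & MASK64
    x ^= x >> 31
    return x


def stateless_random() -> int:
    """Return a 64-bit pseudo-random value without keeping generator state.
    perf_counter_ns() is a stand-in for reading RDTSC."""
    return mix64(time.perf_counter_ns())


def should_sample(one_in: int) -> bool:
    """Sample roughly one out of every `one_in` operations."""
    return stateless_random() % one_in == 0


if __name__ == "__main__":
    hits = sum(should_sample(100) for _ in range(100_000))
    print(f"sampled {hits} of 100000 operations")
```

Because no generator state is kept, there is nothing to seed, which is the determinism trade-off the post mentions for tests.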
reposted by
Alex Miller
Qian Li
10 months ago
We had our biggest
@southbaysystems.xyz
meetup yet last night! Thanks to everyone who came, and thanks to Databricks for hosting!
@andypavlo.bsky.social
discussed the 50-year history of database tuning, applying AI/ML to the problem, and the future of auto-tuning (agentic reasoning, of course).
2
23
8
If you have a blog hosted on cloudflare pages and any part of your css looks missing only on safari, it's because of
www.cloudflarestatus.com/incidents/ps...
, and you have to go purge the cache to get rid of the cached wrongly-compressed asset files.
Cloudflare Pages: Compression issues with custom hostnames
Cloudflare's Status Page - Cloudflare Pages: Compression issues with custom hostnames.
https://www.cloudflarestatus.com/incidents/psbtf5g99qjc
10 months ago
0
2
0