Hannes Mühleisen
@hannes.muehleisen.org
📤 6473
📥 999
📝 126
I like databases and boats. Co-creator of @duckdb.org, Co-Founder and CEO DuckDB Labs.
reposted by
Hannes Mühleisen
DuckDB
11 days ago
You can now query Iceberg catalogs with DuckDB – from the convenience of your browser. In our latest blog post, we break down how DuckDB's WebAssembly client makes this possible with zero setup and showcase it in action using Amazon S3 Tables. 👉
duckdb.org/2025/12/16/i...
1
30
5
reposted by
Hannes Mühleisen
Markus Winand
11 days ago
Adieu Apache Derby, Welcome DuckDB
modern-sql.com/blog/2025-12...
DuckDB Coverage
Apache Derby is retired, ModernSQL now covers DuckDB instead
https://modern-sql.com/blog/2025-12/derby-duckdb
1
20
5
reposted by
Hannes Mühleisen
Carlo Piovesan
11 days ago
If you are into Iceberg or DuckDB, this is a major step: you can access existing Iceberg datasets while they are appended to or updated by other users (who might also be clicking buttons in a browser). This has been very fun, and I was looking forward to being able to share it properly. Thanks, flock!
2
16
2
Found this fun bug in #Google Maps. The Austin airport jumps away and back quite far just by zooming...
11 days ago
0
2
0
reposted by
Hannes Mühleisen
DuckDB
15 days ago
In seven weeks, we will host the first DuckDB Developer Meeting in Amsterdam. We have four exciting talks lined up on building extensions in DuckDB, encryption in DuckDB, DuckPL and GizmoEdge. For more details and the registration links, please head to
duckdb.org/events/2026/...
0
8
4
reposted by
Hannes Mühleisen
DuckDB
18 days ago
We are happy to announce DuckDB v1.4.3 LTS, our latest patch release. Along with bugfixes, this release ships native extensions and Python support for Windows ARM64. Head to
duckdb.org/2025/12/09/a...
for the announcement blog post and installation instructions.
0
24
7
reposted by
Hannes Mühleisen
Kyle Walker
21 days ago
This is the most exciting time ever to be working in data, and I'm not talking about AI. 3 years ago, I wrote a database-centric guide in my book for analyzing the full 92 million record 1910 Census. Now, with #rstats and @duckdb? Analyze those 92 million rows in seconds.
0
32
3
reposted by
Hannes Mühleisen
Markus Winand
29 days ago
modern-sql.com now covers DuckDB.
Modern SQL: A lot has changed since SQL-92
SQL has evolved. Beyond the relational model. Discover it now.
https://modern-sql.com/
2
46
14
reposted by
Hannes Mühleisen
DuckDB
29 days ago
In DuckDB v1.4.2, we shipped a number of features and improvements to the DuckDB-Iceberg extension: insert, update, and delete statements are all supported now. Read Tom Ebergen's new article on these features at
duckdb.org/2025/11/28/i...
1
29
5
reposted by
Hannes Mühleisen
DuckDB
about 1 month ago
DuckDB v1.4 introduced the much-requested feature of database encryption. In our new blog post, @ccfelius.bsky.social and @hannes.muehleisen.org explain how the encryption works under the hood:
duckdb.org/2025/11/19/e...
0
28
7
reposted by
Hannes Mühleisen
Qiusheng Wu
about 1 month ago
I’m thrilled to share that my new book (Spatial Data Management with DuckDB) is now published! 🎉 At 430 pages, this book provides a practical, hands-on guide to scalable geospatial analytics and visualization using DuckDB. All code examples are open-source and freely available on GitHub.
3
55
13
reposted by
Hannes Mühleisen
DuckDB
about 1 month ago
🚀 We released DuckDB v1.4.2, the second patch release of our LTS edition. 🔎 We are shipping new Iceberg features, improved logger/profiler integration and several bugfixes. The new DuckDB version can also read and write Vortex files. 📖 For more details, read
duckdb.org/2025/11/12/a...
0
39
6
reposted by
Hannes Mühleisen
Ben Schneider
about 2 months ago
This profile in ‘Significance’ on DuckDB co-founder Hannes Mühleisen is quite interesting, and has helpful insights about data quality and the changing meaning of “big data.” Also some good professional advice in here for statisticians.
academic.oup.com/jrssig/artic...
Is big data dead?
Abstract. Data, ducks and statistics – Sandra Alba gathers dispatches from Amsterdam and Auckland
https://academic.oup.com/jrssig/article-abstract/22/6-7/55/8275182
0
8
2
reposted by
Hannes Mühleisen
Development Seed
about 2 months ago
We took Canada’s Spatial Access Measures dataset (big, clunky CSVs) → turned it into a single GeoParquet file. Add DuckDB-WASM + deck.gl & you get
- instant queries
- smooth maps
- no backend
Public data, but actually usable.
developmentseed.org/blog/2025-10...
@saadiqmohiuddin.bsky.social
0
21
5
reposted by
Hannes Mühleisen
Marco Slot
about 2 months ago
pg_lake just went open source! (Apache 2.0) pg_lake is a set of extensions (from Crunchy Data Warehouse) that add comprehensive Iceberg support and data lake access to Postgres, with @duckdb.org transparently integrated into the query engine. Announcement blog:
www.snowflake.com/en/engineeri...
1
27
6
reposted by
Hannes Mühleisen
DuckDB
about 2 months ago
The PyData Amsterdam 2025 keynote “Minus Three Tier: Data Architecture Turned Upside Down” by @hannes.muehleisen.org is out now.
www.youtube.com/watch?v=DxwD...
KEYNOTE: Hannes Mühleisen - Data Architecture Turned Upside Down | PyData Amsterdam 2025
YouTube video by PyData
https://www.youtube.com/watch?v=DxwDaoUijTc
1
25
5
reposted by
Hannes Mühleisen
DuckDB
2 months ago
🎞️ 𝘊𝘢𝘯 you store a movie in DuckDB? In today's blog post,
@hannes.muehleisen.org
shows how to store a movie as a table encoding the RGB codes pixel-by-pixel, and how to process it:
duckdb.org/2025/10/27/m...
Now, whether you 𝘴𝘩𝘰𝘶𝘭𝘥 store a movie in DuckDB... we'll leave that to your judgment.
3
34
7
reposted by
Hannes Mühleisen
DuckDB
2 months ago
📣 New blog post by
@dtenwolde.bsky.social
. 🕸️ In this post, we show how to use DuckDB and the DuckPGQ community extension to analyze financial data for fraudulent patterns with the SQL/PGQ graph syntax that's part of SQL:2023. 📖 Visit
duckdb.org/2025/10/22/d...
to read the post.
0
27
7
reposted by
Hannes Mühleisen
Dirk Eddelbuettel
2 months ago
duckdb-mlpack 0.0.2: mlpack is now a duckdb community extension Bringing mlpack machine learning to duckdb SQL
dirk.eddelbuettel.com/blog/2025/10...
1
28
9
reposted by
Hannes Mühleisen
DuckDB
2 months ago
🇫🇮 We are hosting a pub session next week during the
@helsinkidataweek.bsky.social
, where you can chat with DuckDB's co-creator,
@hannes.muehleisen.org
and have a drink with members of the DuckDB community. 🎟️ Sign up on Luma:
luma.com/s5sl9qxx
0
15
1
reposted by
Hannes Mühleisen
Dirk Eddelbuettel
2 months ago
ML quacks: Combining duckdb and mlpack
dirk.eddelbuettel.com/blog/2025/10...
A 'minimally viable product / demo' of extending @duckdb.org with #mlpack
0
13
2
reposted by
Hannes Mühleisen
Torsten „Teggy“ Grust
2 months ago
I'm grateful that Jack Waudby gave me the chance to set the CTE record straight on his @disseminatepodcast.bsky.social. Hear us talk about what you can do with iterative queries in SQL, how efficient variants of recursion in SQL found their way into @duckdb.org, and how trampolines come into play.
1
7
2
reposted by
Hannes Mühleisen
PyLadies Amsterdam
3 months ago
Learn how to build powerful yet lightweight #data workflows using #Python, #DuckDB, and #Smallpond with Valery C. Briz, #Pythonista, senior #dataengineer, on the 23rd of October in our #online #workshop, 18:00-19:30 CEST. Register here:
www.meetup.com/pyladiesams/...
0
2
1
reposted by
Hannes Mühleisen
DuckDB
3 months ago
🚀 We released DuckDB v1.4.1, the first bugfix release of our LTS edition. 🔎 We expect LTS users to be particularly curious about changes in the system, so we wrote up a short blog post highlighting the most important fixes and improvements.
duckdb.org/2025/10/07/a...
Announcing DuckDB 1.4.1 LTS
Today we are releasing DuckDB 1.4.1, the first bugfix release of our LTS edition.
https://duckdb.org/2025/10/07/announcing-duckdb-141
1
23
3
reposted by
Hannes Mühleisen
CMU Database Group
3 months ago
Today's Future Data Systems Seminar Speaker: Jordan Tigani (@jrdntgn.bsky.social) will present how @motherduck.com supports modern workloads with DuckLake. Zoom talk open to public at 4:30pm ET. YouTube video available after:
db.cs.cmu.edu/events/futur...
[Future Data] DuckLake: Learning from Cloud Data Warehouses to Build a Robust "Lakehouse" - Carnegie Mellon Database Group
When building scalable data systems, it is easy to focus on the... Read More +
https://db.cs.cmu.edu/events/future-data-ducklake-learning-from-cloud-data-warehouses-to-build-a-robust-lakehouse/
0
13
6
reposted by
Hannes Mühleisen
DuckDB
3 months ago
✨ We launched a new installation page for DuckDB! 🚀 The new page lets you install the latest stable DuckDB release with just one or two clicks. If the defaults don't fit your use case, no worries: alternative download methods remain available for many clients.
3
22
6
reposted by
Hannes Mühleisen
Santiago Saavedra
3 months ago
After trying @duckdb.org with terabytes of parquet I'm hardly going back for data exploration to anything else. Hell, I'm now spawning DuckDB for analyzing even .csv and .json files due to how ergonomic its SQL is.
2
33
4
reposted by
Hannes Mühleisen
DuckDB
3 months ago
We published a new deep dive by Laurens Kuiper, who recently redesigned DuckDB's sort. One data point: ordering the TPC-H SF100 lineitem table with the memory limit set to 30 GB is 3× faster in DuckDB v1.4 than in v1.3. Read more at
duckdb.org/2025/09/24/s...
Redesigning DuckDB's Sort, Again
After four years, we've decided to redesign DuckDB's sort implementation, again. In this post, we present and evaluate the new design.
https://duckdb.org/2025/09/24/sorting-again
0
32
10
reposted by
Hannes Mühleisen
DuckDB
3 months ago
🚀 We released version 0.3 of the DuckLake specification and the DuckDB ducklake extension today. It includes interoperability with Iceberg, support for geometry types and more. Check the announcement blog for more details
ducklake.select/2025/09/17/d...
0
39
12
reposted by
Hannes Mühleisen
Marcos Huerta
3 months ago
I'm speaking soon at #PositConf at the 2:40PM session "Get Your Ducks in a Row with Databases" in Regency VI! My talk is "Semantic Search for the Rest of Us with DuckDB (and Llama.cpp)" #PositConf2025
0
5
1
reposted by
Hannes Mühleisen
DuckDB
3 months ago
📈 DuckDB 1.4.0 is out! This is our first LTS release which comes with *one year of community support*. It also supports database encryption, the MERGE SQL statement and Iceberg writes. For more details, read the announcement blog post at
duckdb.org/2025/09/16/a...
0
53
25
We're testing a new distribution channel for @duckdb.org: #docker images. For now they live at `hfmuehleisen/duckdb`, feel free to test them out. And yes, hell got a little colder today.
hub.docker.com/r/hfmuehleis...
https://hub.docker.com/r/hfmuehleisen/duckdb
3 months ago
0
23
3
reposted by
Hannes Mühleisen
Christian Minich
4 months ago
Such a fun listen on ducklake and duckdb with @hannes.muehleisen.org and @markraasveldt.bsky.social! Learned a lot, the future of ducklake looks very bright!
overcast.fm/+AAH1YOLrL6Q
Duck Lake: Simplifying the Lakehouse Ecosystem — Data Engineering Podcast
https://overcast.fm/+AAH1YOLrL6Q
0
18
3
reposted by
Hannes Mühleisen
DuckDB
4 months ago
We are holding the DuckDB Amsterdam Meetup next week, featuring talks by @rolandbouman.bsky.social, Tania Bogatsch and @qxip.bsky.social:
www.meetup.com/duckdb/event...
The event is already at capacity but consider joining the wait list because there are always last-minute RSVP cancellations.
1
16
4
Excited to be a keynote speaker at PyData Amsterdam 2025 (September 24–26). My talk is titled 'Minus Three Tier: Data Architecture Turned Upside Down'. Use code PYDATADB10 for 10% off tickets
amsterdam.pydata.org/conference
#PDAmsterdam2025
#10YearsPDAmsterdam
4 months ago
0
9
3
reposted by
Hannes Mühleisen
DuckDB
4 months ago
Big Data on the Move: Can a Framework Laptop 13 ultrabook run terabyte-sized workloads with DuckDB? @szarnyasg.org ran the experiments and shared his findings in our latest blog post:
duckdb.org/2025/09/08/d...
Big Data on the Move: DuckDB on the Framework Laptop 13
We put DuckDB through its paces on a 12-core ultrabook with 128 GB RAM, running TPC-H queries up to SF10,000.
https://duckdb.org/2025/09/08/duckdb-on-the-framework-laptop-13
3
30
7
reposted by
Hannes Mühleisen
DuckDB
4 months ago
We just launched the “DuckDB in Science” site, a curated collection of papers, lectures and podcasts about DuckDB in research:
duckdb.org/science/
🎡 If you would like to learn more about DuckDB in Science, consider joining our meetup in London this Thursday:
www.meetup.com/duckdb/event...
2
45
14
reposted by
Hannes Mühleisen
DuckDB
4 months ago
🕐 🤔 Timestamps and time zones can be confusing! 😵 💡 To help you make sense of time zones in SQL, Richard Wesley wrote a short guide that covers some typical pitfalls:
duckdb.org/docs/stable/...
Timestamp Issues
Timestamp With Time Zone Promotion Casts Working with time zones in SQL can be quite confusing at times. For example, when filtering to a date range, one might try the following query: SET timezone = ...
https://duckdb.org/docs/stable/guides/sql_features/timestamps
1
29
3
reposted by
Hannes Mühleisen
PVLDB
5 months ago
Vol:18 No:8 → Saving Private Hash Join 👥 Authors: Laurens Kuiper, Paul Gross, Peter Boncz, Hannes Mühleisen 📄 PDF: https://www.vldb.org/pvldb/vol18/p2748-kuiper.pdf
0
14
5
reposted by
Hannes Mühleisen
DuckDB
4 months ago
New blog post by Petrica Leuca: Basic Feature Engineering with DuckDB. In this post, we show how to perform essential machine learning data preprocessing tasks (missing value imputation, categorical encoding, and feature scaling) directly in DuckDB using SQL and benchmark it against scikit-learn.
1
25
4
reposted by
Hannes Mühleisen
Mike Bostock
4 months ago
A little demo of reactive SQL in Observable Notebooks 2.0, first using (native) DuckDB to bake data from a remote source, followed by DuckDB-Wasm to create and query reactive views in the client. Should be released this week!
1
55
7
reposted by
Hannes Mühleisen
DuckDB
4 months ago
🎓 On September 4, we are hosting a new kind of meetup in London which will focus on the use of DuckDB in Science and Education! ⚡️ We still have some spots for lightning talks. If you're working with DuckDB in your research and/or classroom, consider sharing your story! 🔗
duckdb.org/events/2025/...
DuckDB Meetup on Science and Education in London
DuckDB is an in-process SQL database management system focused on analytical query processing. It is designed to be easy to install and easy to use. DuckDB has no external dependencies. DuckDB has bin...
https://duckdb.org/events/2025/09/04/duckdb-science-and-education-london-meetup/
0
13
5
reposted by
Hannes Mühleisen
4 months ago
Stretching DuckDB w/ Common Crawl, ~1.7B rows, ~300 parquet files. ~2-3s for single-column aggregations, ~2-3 mins to SUMMARIZE the data, peaking at ~12-14GB memory usage. Not exactly real-time, but the fact you can do this on a laptop with no server setups or Spark pipelines is still amazing.
1
44
10
reposted by
Hannes Mühleisen
DuckDB
4 months ago
🔥 DuckDB is featured in
@fireship.bsky.social
's “100 seconds” series: 🚀
www.youtube.com/watch?v=uHm6...
DuckDB in 100 Seconds
YouTube video by Fireship
https://www.youtube.com/watch?v=uHm6FEb2Re4
0
30
5
reposted by
Hannes Mühleisen
Purple Frog Systems🐸
5 months ago
Not every job needs Spark or BigQuery. Sometimes, you just need DuckDB. Find out why it’s a game-changer for local analytics 🐤 👉 Read the Frog Blog by Joe!
www.purplefrogsystems.com/2025/08/why-...
#DuckDB
#SQL
#DataEngineer
0
6
2
reposted by
Hannes Mühleisen
Julia Silge
5 months ago
I'm excited to speak this afternoon at #useR2025 on outgrowing your laptop with #Positron for #rstats users! You can check out my slides at
juliasilge.github.io/useR-2025/
1
55
14
reposted by
Hannes Mühleisen
v
5 months ago
I went through DuckDB's WAL, and it does everything I was asking for in my blog post: 1. Per record checksum 2. Explicit error on checksum failure 3. Configurable behavior 4. Partial recovery 5. Safe truncation of the WAL only when WAL contents are checkpointed
1
22
9
reposted by
Hannes Mühleisen
DuckDB
5 months ago
We just published a deep dive on spatial joins in DuckDB by @maxxen.bsky.social. In this blog post, Max explains how spatial joins evolved in DuckDB and how the current operator harnesses R-Trees. Read the full post at
duckdb.org/2025/08/08/s...
0
49
13
reposted by
Hannes Mühleisen
Delta Lake
5 months ago
@hannes.muehleisen.org, Co-Creator of @duckdb.org, will be talking about the DuckLake project at the Open Lakehouse Meetup in Amsterdam on August 27th! Don't miss it. 🦆🦀 Sign up here ➡️
lu.ma/OLM-827
#duckdb
#opensource
#oss
#deltalake
#openlakehouse
0
9
3