Jake Thomas
@jakthom.bsky.social
📤 3390
📥 1520
📝 341
Sometimes data systems, sometimes security, sometimes ai/ml, sometimes a blend of it all.
bye Kafka 👋
about 1 month ago
1
5
0
My father just proved he was right and I was wrong. Using AI. Well done 🫡
about 1 month ago
0
1
0
Engineering is fun. That is all.
2 months ago
0
5
0
(for those of us who like 🎧🎶 and 🤖...)
open.spotify.com/episode/1n9V...
F3: The Open-Source Data File Format for the Future
https://open.spotify.com/episode/1n9VuXUtwCiP5uErXzaK34?si=10ce6bf50153444c
3 months ago
1
3
0
datafusion.apache.org/blog/2025/08...
Using External Indexes, Metadata Stores, Catalogs and Caches to Accelerate Queries on Apache Parquet - Apache DataFusion Blog
https://datafusion.apache.org/blog/2025/08/15/external-parquet-indexes/
5 months ago
0
2
0
bla...bla bla...bla bla... build.
6 months ago
1
4
0
so good
byteofdev.com/posts/making...
Making Postgres 42,000x slower because I am unemployed
As a respectable unemployed person must do, I tried to make Postgres as slow as possible
https://byteofdev.com/posts/making-postgres-slow/
6 months ago
0
7
2
Sufficiently-advanced software deployments are indistinguishable from magic.
6 months ago
0
10
0
😍
blog.cloudflare.com/logexplorer-...
Cloudflare Log Explorer is now GA, providing native observability and forensics
We are happy to announce the General Availability of Cloudflare Log Explorer, a powerful product designed to bring observability and forensics capabilities directly into your Cloudflare dashboard.
https://blog.cloudflare.com/logexplorer-ga/
7 months ago
0
2
0
"lakebase".... ..... ... .. . 🤣🤣🤣🤣🤣🤣🤣🤣
7 months ago
2
4
0
So.... what exactly happens when databases are commoditized?..
7 months ago
2
0
0
So good.
query.farm/duckdb_exten...
7 months ago
0
3
0
❤️
8 months ago
0
1
0
🎯🎯🎯🎯
8 months ago
0
3
0
reposted by
Jake Thomas
Tobias Müller
8 months ago
Welcome to the age of $10/month Lakehouses! How to build and run a Lakehouse on top of @cloudflare.social R2, Cloudflare Containers and Neon Postgres, all backed by the new DuckLake "SQL as Lakehouse" format, via @duckdb.org.
tobilg.com/the-age-of-1...
Welcome to the age of $10/month Lakehouses
No, this article is not about buying properties close to lakes...
https://tobilg.com/the-age-of-10-dollar-a-month-lakehouses
4
45
11
ducklake.select is neat. But I'd put it into prod tomorrow if it was called Drake.
DuckLake is an integrated data lake and catalog format.
DuckLake delivers advanced data lake features without traditional lakehouse complexity by using Parquet files and your SQL database. It's an open, standalone format from the DuckDB team.
https://ducklake.select/
8 months ago
1
6
0
reposted by
Jake Thomas
Tanel Poder
9 months ago
0x.tools xCapture v3: Linux Performance Analysis with Modern eBPF and DuckDB 👀👀
tanelpoder.com/posts/xcaptu...
1
29
12
@jackhcable.bsky.social discussing product security in the AI era 🔥
9 months ago
0
2
0
@ethanrosenthal.com speaking 🔥
9 months ago
0
4
1
I've been exploring how to do Iceberg with the fewest dependencies possible. tl;dr -> @duckdb.org, @amazonwebservices.bsky.social S3 Tables, and a sprinkle of Python is awesome.
jakthom.dev/blog/zero-in...
Creating a Zero-Infrastructure Iceberg Data Lake in 5 Minutes
Zero-Infrastructure Iceberg in 5 Minutes
https://jakthom.dev/blog/zero-infrastructure-iceberg-data-lake-with-s3-tables-and-duckdb/
9 months ago
3
44
8
idc what anyone says, I love deploying static websites by hand
9 months ago
0
5
0
reposted by
Jake Thomas
Micah Wylde
9 months ago
Arroyo is joining @cloudflare.social! We're bringing Arroyo to the Developer Platform as a serverless stream processing system, and will also remain open-source and self-hostable.
www.arroyo.dev/blog/arroyo-...
Arroyo is joining Cloudflare
Arroyo has been acquired by Cloudflare to bring serverless SQL stream processing to the Cloudflare Developer Platform, integrated with Queues, Workers, and R2. The Arroyo Engine will remain open-sour...
https://www.arroyo.dev/blog/arroyo-is-joining-cloudflare
2
18
4
Zero egress costs will be (are?) the new data gravity. H/t @eastdakota.com
open.spotify.com/episode/6lUL...
How Cloudflare is Working to Fix the Internet with Matthew Prince
Screaming in the Cloud · Episode
https://open.spotify.com/episode/6lULQTEgKGkvxBjCGS5Cg2?si=YiHdi5SAQciwwQ34QNL94A
9 months ago
0
6
0
Aaaand 4/10 is quite the day in the world of data systems....
datafusion.apache.org/blog/2025/04...
tpchgen-rs World’s fastest open source TPC-H data generator, written in Rust - Apache DataFusion Blog
https://datafusion.apache.org/blog/2025/04/10/fastest-tpch-generator/
9 months ago
0
0
0
@cloudflare.social we ♥️ u
blog.cloudflare.com/r2-data-cata...
9 months ago
0
7
0
So....has anyone built an MCP @duckdb.org extension yet?...
9 months ago
3
4
0
reposted by
Jake Thomas
Julien Le Dem
9 months ago
I'm looking forward to seeing you in person at the Iceberg summit in SF tomorrow. I'll be speaking about the evolution of data storage from Hadoop to Iceberg and how we're witnessing the Advent of The Open Data Lake.
www.icebergsummit2025.com
0
11
1
arrow is all you need
10 months ago
1
5
0
🔥
planetscale.com/blog/io-devi...
IO devices and latency — PlanetScale
Take an interactive journey through the history of IO devices, and learn how IO device latency affects performance.
https://planetscale.com/blog/io-devices-and-latency
10 months ago
0
13
1
reposted by
Jake Thomas
Tobias Müller
10 months ago
I wrote a quick blog post on how to set up and use Amazon S3 Tables with @duckdb.org, based on its new Iceberg capabilities:
tobilg.com/query-s3-tab...
Query S3 Tables with DuckDB
DuckDB has gained a new feature in preview, that allows querying of Iceberg data in AWS S3 Tables. Setting up a S3 Table There are multiple steps which need to be performed to set up a S3 Table that can be then queried with tools like DuckDB. As the ...
https://tobilg.com/query-s3-tables-with-duckdb
1
17
4
😂
www.youtube.com/watch?v=3JW7...
I replaced my entire tech stack with Postgres...
YouTube video by Fireship
https://www.youtube.com/watch?v=3JW732GrMdg
10 months ago
1
3
0
My favorite llm hack continues to be "...and explain it like I'm 10"
11 months ago
0
2
0
Big data: unnecessarily scanning 100gb to "limit 50"
11 months ago
0
13
0
quack
12 months ago
0
5
0
reposted by
Jake Thomas
Daniel ten Wolde
12 months ago
🚀 Today at #DuckCon, I’ll give a lightning talk on graph analytics in @duckdb.org using SQL/PGQ & the DuckPGQ extension! In just 5 mins, get up to speed on the new syntax & running graph queries inside DuckDB.
🎥 Live stream (3 PM CET): www.youtube.com/@duckdb
📅 Program: duckdb.org/events/2025/...
DuckDB
https://www.youtube.com/@duckdb
2
21
3
reposted by
Jake Thomas
Carlo Piovesan
12 months ago
DuckCon #6 a few hours away, Pakhuis De Zwijger or live on the @duckdb.org YouTube channel.
duckdb.org/events/2025/...
Speakers are impressive, looking forward to hearing from them, chance to see in person a bunch of GitHub/Discord handles / talk DuckDB a bunch. I will be around, come say quack!
1
16
6
reposted by
Jake Thomas
Mathias Lafeldt
12 months ago
Zig's concurrent/cached builds are sooo fast that I changed the default behavior of zig build to compile the extension for *all* supported DuckDB versions & platforms. Simplifies and speeds up CI. Also, the output dir (zig-out) is now a ready-to-use extension repository!
github.com/mlafeldt/qua...
Multi-version builds by mlafeldt · Pull Request #1 · mlafeldt/quack-zig
Build and test the extension for all DuckDB versions by default, leveraging Zig's concurrent/cached builds. This simplifies CI and turns zig-out into a ready-to-use extension repository. Uncach...
https://github.com/mlafeldt/quack-zig/pull/1
1
9
4
The first time I used Redshift was with @jamesdensmore.bsky.social, as its OLAP perf was wild compared to our Postgres analytics db. Snowflake then fixed Redshift problems like WLM, workload isolation, upgrades, etc. But we prob would not have used Redshift if DuckDB-in-PG had existed...
12 months ago
2
7
0
welcome @beardbrewery.bsky.social 👋
12 months ago
0
2
0
Yep
12 months ago
1
3
0
Dear Santa: Single-node, Kafka api-compat thing that shoves data into parquet. ♥️, me
12 months ago
5
30
3
♥️
12 months ago
0
3
0
No git repos are the next best thing but we're probably stuck with those for a while.
github.com/jakthom/nodb
#databs
12 months ago
0
6
1
The best code is no code at all. The best database is no database (infrastructure) at all.
#databs
12 months ago
5
22
3
reposted by
Jake Thomas
Hannes Mühleisen
12 months ago
I totally missed that hell froze over: Arm runners are finally available on the free tier of @github.com actions! I repeat, ARM RUNNERS! `runs-on: ubuntu-22.04-arm`
#databs
#yamlgames
3
43
8
Jeep waving from my Prius is the most fun
12 months ago
1
0
0
reposted by
Jake Thomas
Pedro Alcocer
about 1 year ago
me, a duckdb user, whispering in the ear of a summer research assistant at my professor father's villa in northern italy who i've quickly fallen in love with; the late afternoon sun warming the ancient stone walls as cicadas buzz in the apricot trees: "union all by name"
#databs
0
17
1
🔥
about 1 year ago
0
5
0
reposted by
Jake Thomas
Tobias Müller
about 1 year ago
OPFS support has landed in @duckdb.org WASM! That means you can now persist data in the browser, which opens a wide range of use cases…
github.com/duckdb/duckd...
Add OPFS (Origin Private File System) Support by e1arikawa · Pull Request #1856 · duckdb/duckdb-wasm
Description: This PR implements OPFS (Origin Private File System) support in the latest version of duckdb-wasm based on PR #1490. This allows database files to be read and written to the OPFS. API:...
https://github.com/duckdb/duckdb-wasm/pull/1856
1
34
6
reposted by
Jake Thomas
Mimoune Djouallah
about 1 year ago
Fivetran casually adding support for attaching an iceberg catalog to #duckdb, nice !!!
github.com/duckdb/duckd...
Add Iceberg catalog support by ediril · Pull Request #95 · duckdb/duckdb-iceberg
This PR adds support for attaching an Iceberg catalog and be able to read iceberg tables from a datalake
https://github.com/duckdb/duckdb-iceberg/pull/95
1
20
3