Micah Wylde
@micahw.com
📤 444
📥 112
📝 54
Co-founder arroyo.dev, building next-gen streaming systems. Prev Splunk, Lyft, Sift, Quantcast.
reposted by
Micah Wylde
Jeremy Morrell
3 months ago
Another brand new feature is the R2 Data Catalog:
blog.cloudflare.com/cloudflare-d...
Build something with Pipelines and R2 SQL. I suggest receiving OpenTelemetry data and then surfacing that in a web app (logs should be fairly straightforward), but there are tons of uses for this.
Announcing the Cloudflare Data Platform: ingest, store, and query your data directly on Cloudflare
The Cloudflare Data Platform, launching today, is a fully-managed suite of products for ingesting, transforming, storing, and querying analytical data, built on Apache Iceberg and R2 storage.
https://blog.cloudflare.com/cloudflare-data-platform/
1
1
2
reposted by
Micah Wylde
Jeremy Morrell
3 months ago
It's early, but I'm excited about the direction that the Cloudflare Data Platform is taking. Trying to set up similar pipelines on other clouds would typically be $$$ and take tons of expertise: managing Kafka and multiple services for ingestion, compaction, etc.
blog.cloudflare.com/cloudflare-d...
Announcing the Cloudflare Data Platform: ingest, store, and query your data directly on Cloudflare
The Cloudflare Data Platform, launching today, is a fully-managed suite of products for ingesting, transforming, storing, and querying analytical data, built on Apache Iceberg and R2 storage.
https://blog.cloudflare.com/cloudflare-data-platform/
3
8
1
The news is finally out! Cloudflare has a Data Platform! We're starting with serverless streaming pipelines (powered by Arroyo), a managed Iceberg Catalog, and a new distributed SQL engine built on top of DataFusion.
3 months ago
0
12
0
reposted by
Micah Wylde
Andrew Lamb
7 months ago
Reminder: San Francisco @ApacheDataFusio meetup tomorrow:
lu.ma/uuxd443e
SF Apache DataFusion Meetup · Luma
Join us for an evening of learning, networking, and diving into Apache DataFusion, the blazing-fast query execution framework for Rust-based data…
https://lu.ma/uuxd443e
0
3
1
reposted by
Micah Wylde
Cloudflare
7 months ago
Cloudflare is at Snowflake Summit in San Francisco this week! Swing by our booth 2605 to chat about the new Cloudflare R2 Data Catalog and how it can make your data management and analytics easier!
0
8
2
Next Monday after the Snowflake Summit keynote! Hang out on our beautiful roof with other cool data folks, and hear some great speakers from LanceDB,
@mooncakelabs.bsky.social
, Eventual, Marimo, Bobsled, and
@cloudflare-dev.bsky.social
!
lu.ma/dbq1hfij
Modern Data w/ Cloudflare + Friends · Luma
Come talk about modern data formats, streaming ingestion, query engines and how you feel about Iceberg at Cloudflare's HQ. We'll be running a series of…
https://lu.ma/dbq1hfij
7 months ago
0
1
0
reposted by
Micah Wylde
Chris
8 months ago
Ok, y'all. This took me several weeks and a ton of help from
@frankmcsherry.bsky.social
and
@lalithsuresh.bsky.social
. I dug into timely dataflow, differential dataflow, and DBSP to get you up to speed on IVM engines and materialized views. Enjoy!
Everything You Need to Know About Incremental View Maintenance
An overview of incremental view maintenance, why it’s useful, and how you can implement it.
https://materializedview.io/p/everything-to-know-incremental-view-maintenance
4
76
21
I’m only a week into life at
@cloudflare-dev.bsky.social
but already amazed by how much of Cloudflare is built _on_ Cloudflare. I’d never have guessed you could get so far with just workers + durable objects!
8 months ago
0
1
0
Arroyo is joining
@cloudflare.social
! We're bringing Arroyo to the Developer Platform as a serverless stream processing system, and will also remain open-source and self-hostable.
www.arroyo.dev/blog/arroyo-...
Arroyo is joining Cloudflare
Arroyo has been acquired by Cloudflare to bring serverless SQL stream processing to the Cloudflare Developer Platform, integrated with Queues, Workers, and R2. The Arroyo Engine will remain open-sour...
https://www.arroyo.dev/blog/arroyo-is-joining-cloudflare
9 months ago
2
18
4
reposted by
Micah Wylde
rmoff 🏃♂️🫖🥓
9 months ago
Couple of big announcements from
@cloudflare.social
today for folk in
#dataBS
: * Acquisition of Arroyo, launch of Pipelines for streaming ingestion:
blog.cloudflare.com/cloudflare-a...
* Launch of R2 Data Catalog—a managed Apache Iceberg catalog for R2
blog.cloudflare.com/r2-data-cata...
Just landed: streaming ingestion on Cloudflare with Arroyo and Pipelines
We’ve just shipped our new streaming ingestion service, Pipelines — and we’ve acquired Arroyo, enabling us to bring new SQL-based, stateful transformations to Pipelines and R2.
https://blog.cloudflare.com/cloudflare-acquires-arroyo-pipelines-streaming-ingestion-beta/
0
9
3
Arroyo 0.14.0 is now available, including new lookup joins, support for nested updating aggregates, struct types, new syntax, and a bunch of improvements and fixes:
www.arroyo.dev/blog/arroyo-...
Announcing Arroyo 0.14.0
Arroyo 0.14 is now available! This release introduces support for lookup joins, more powerful updating SQL, new syntax, structs in DDL, and more!
https://www.arroyo.dev/blog/arroyo-0-14-0
9 months ago
0
1
0
I know by month 2 we're all inured to this stuff, but this is a beyond crazy mix of incompetence and illegality
www.theatlantic.com/politics/arc...
The Trump Administration Accidentally Texted Me Its War Plans
U.S. national-security leaders included me in a group chat about upcoming military strikes in Yemen. I didn’t think it could be real. Then the bombs started falling.
https://www.theatlantic.com/politics/archive/2025/03/trump-administration-accidentally-texted-me-its-war-plans/682151/
9 months ago
0
0
0
Arroyo is sitting at 3,999 stars... who's going to put us over the top?
github.com/ArroyoSystem...
10 months ago
1
3
0
You'd think that the key to being a fast streaming engine is like clever join algorithms, but it's mostly just being really good at JSON. Arroyo uses Arrow and the arrow-rs JSON decoder along with some streaming extensions. I think it's pretty cool, so I wrote up a long explanation of how it works
Fast columnar JSON decoding with arrow-rs
JSON is the most common serialization format used in streaming pipelines, so it pays to be able to deserialize it fast. This post covers in detail how the arrow-json library works to perform very effi...
https://www.arroyo.dev/blog/fast-arrow-json-decoding
10 months ago
0
14
1
Our team at Arroyo recently needed to rebuild our (very ad-hoc) analytics infra to account for our growth. We spent some time working out the best way to set up a near-real-time data lake today, and ended up with a pretty sweet approach we're calling the LOAD stack:
www.arroyo.dev/blog/buildin...
Building a near-real-time data lake with the LOAD stack
The LOAD stack (log storage/object storage/Arroyo/DuckDB) makes it easy to build an affordable real-time data lake with minimal operational overhead. This tutorial will guide you through the process o...
https://www.arroyo.dev/blog/building-a-real-time-data-lake
11 months ago
1
7
2
Arroyo 0.13.0 is now available! This one includes some big improvements to the core engine (including the operator chaining work I wrote about previously:
bsky.app/profile/mica...
) and a bunch of other features. All the details on our blog:
www.arroyo.dev/blog/arroyo-...
about 1 year ago
1
4
0
Is
#DataBS
interested in the internals of streaming engines? The next release of
arroyo.dev
(0.13) has a new feature in the core dataflow called operator chaining which gets at some of the interesting details of how these systems work. So let’s dive into streaming dataflow 🧵
Arroyo — Cloud-native stream processing
Arroyo is the easiest way to run SQL queries against your streaming data
https://arroyo.dev
about 1 year ago
1
37
7
nothing like a potential natural disaster to bring us all together here 🤗
about 1 year ago
0
0
0
Sad to see Redis Labs burning whatever shreds of credibility they still had with the open source community. Making money as an open source co is hard but there have to be better ways than this
github.com/redis-rs/red...
Future Crate Maintenance and Redis Inc. Relationship · Issue #1419 · redis-rs/redis-rs
Hello users. I haven't actively maintained this library in a very long time as you probably noticed. I am still controlling the entry on crates.io for it alongside the redis release team and @badbo...
https://github.com/redis-rs/redis-rs/issues/1419
about 1 year ago
0
3
0
Happy new Jepsen Report Day to all who celebrate!
jepsen.io/analyses/buf...
Confirms my priors that almost no one should be directly calling the Kafka client today—use a stream processing engine for data use cases or a durable execution engine for applications.
Jepsen: Bufstream 0.1.0
https://jepsen.io/analyses/bufstream-0.1.0
about 1 year ago
1
4
0
If you missed
#p99conf
last week, talks are now available to stream on YouTube. I spoke about the design decisions that went into Arroyo's incredible performance:
youtube.com/watch?v=7H4C...
Come for the Rust hot takes, stay for my terrible hand-drawn architecture diagrams 😅
https://youtube.com/watch?v=7H4CIKZ4YJg
about 1 year ago
0
2
2
Guess we're all here now 👋
about 1 year ago
1
12
1