Dani Solà
@dani-sola.com
📤 39
📥 63
📝 30
Interested in people, distributed systems, sustainability, and all things data.
reposted by
Dani Solà
Roland Paris
4 days ago
This is one of the sharpest analyses of international affairs that I've heard from a Canadian leader - or any national leader - in a long time. And I suspect he wrote the main bits himself.
paulwells.substack.com/p/the-carney...
The Carney doctrine
Open comment thread on the PM's Davos speech
https://paulwells.substack.com/p/the-carney-doctrine?utm_campaign=post&utm_medium=email&triedRedirect=true
7
170
68
reposted by
Dani Solà
Sung Kim
22 days ago
Why did Nvidia buy Groq for $20 billion?
medium.com/the-low-end-...
Groq’s Deterministic Architecture is Rewriting the Physics of AI Inference
How Nvidia Learned to Stop Worrying and Acquired Groq
https://medium.com/the-low-end-disruptor/groqs-deterministic-architecture-is-rewriting-the-physics-of-ai-inference-bb132675dce4
0
4
1
Highly interesting podcast on robust causal inference. Richard Hahn is an expert and a great communicator!
26 days ago
0
4
1
reposted by
Dani Solà
conputer dipshit
about 1 month ago
ignore the title about caching, this is the best explanation of how LLMs work, period
Prompt caching: 10x cheaper LLM tokens, but how? | ngrok blog
A far more detailed explanation of prompt caching than anyone asked for.
https://ngrok.com/blog/prompt-caching/
3
192
46
reposted by
Dani Solà
Aria Desires
about 1 month ago
so pumped for the ty beta to finally be here, we did so much great work it rules!
astral.sh/blog/ty
ty: An extremely fast Python type checker and language server
ty is an extremely fast Python type checker and language server, written in Rust, and designed as an alternative to mypy, Pyright, and Pylance.
https://astral.sh/blog/ty
3
121
22
reposted by
Dani Solà
rmoff 🏃♂️🫖🥓
2 months ago
How to Sell Data Modeling - good stuff from
@joereis.bsky.social
practicaldatamodeling.substack.com/p/how-to-sel...
How to Sell Data Modeling
Making the Invisible Visible
https://practicaldatamodeling.substack.com/p/how-to-sell-data-modeling
0
6
2
reposted by
Dani Solà
Paul Hünermund
3 months ago
⏰ Last chance to register for
#CDSM2025
! Don't miss your chance to join us Nov 12–13 for two days of talks & debates at the intersection of causality, data science & AI. 💻 Online | 🎟️ Free 👉
causalscience.org
0
14
9
reposted by
Dani Solà
Gail Myerscough
5 months ago
If I’m being honest, I’m feeling pretty crap about my small business. It’s so bloody difficult at the moment with rising costs, US tariffs, Brexit nonsense and the threat of AI. Please have a look at what I do and repost to spread the word
gailmyerscough.co.uk
38
389
467
reposted by
Dani Solà
Gaël Varoquaux
5 months ago
Our didactic review on machine learning for causal inference, now open access: • identifiability (theory of when the data can answer a causal question) • machine-learning estimators • study design (asking well-framed questions + loopholes, e.g. with timewise data)
www.annualreviews.org/content/jour...
2
43
10
reposted by
Dani Solà
Tim Kellogg
6 months ago
Deep Agents: this is a great 10 min video that’s absolutely worth your time. Deep Agent = planning tool (TODO lists) + subagents + filesystem + long detailed system prompt. Seems like a deconstruction of why Claude Code works so well.
www.youtube.com/watch?v=433S...
What are Deep Agents?
YouTube video by LangChain
https://www.youtube.com/watch?v=433SmtTc0TA
3
93
10
Recommended watch. Although the war is not as often in the news any more, help is as important as ever. And we can all have a direct impact on defending democracy.
6 months ago
0
3
1
reposted by
Dani Solà
Simon Willison
7 months ago
I like this take by
@kentbeck.com
on how AI-assisted programming changes the balance of which skills are most important. From this interview with
@gergely.pragmaticengineer.com
newsletter.pragmaticengineer.com/p/tdd-ai-age...
7
155
17
reposted by
Dani Solà
Ethan Mollick
7 months ago
No signs of an end to rapid gains in AI ability at ever-decreasing costs, yet I did my best to update my chart to take into account the price drop in o3 & new models released by Google. GPT-4 was released 2.25 years ago, so it's worth noting the trend when considering the future of AI capabilities & cost.
3
77
10
reposted by
Dani Solà
Jack Vanlightly
8 months ago
How to reliably distribute work across microservices, stream processors, durable execution, event-driven, orchestration and now AI agents? Coordinated Progress is a 4 part series that explores the common structure behind reliable distributed systems.
jack-vanlightly.com/blog/2025/6/...
Coordinated Progress – Part 1 – Seeing the System: The Graph — Jack Vanlightly
At some point, we’ve all sat in an architecture meeting where someone asks, “Should this be an event? An RPC? A queue?”, or “How do we tie this process together across our microservices? Should it ...
https://jack-vanlightly.com/blog/2025/6/11/coordinated-progress-part-1
3
33
8
I tested ChatGPT, Claude, and Mistral on a multimodal problem about a washing machine. ChatGPT emerged the winner.
#ai
#chatgpt
#claude
#mistral
9 months ago
1
2
1
reposted by
Dani Solà
Charlie Marsh
9 months ago
Today, we’re announcing the preview release of ty, an extremely fast type checker and language server for Python, written in Rust. In early testing, it's 10x, 50x, even 100x faster than existing type checkers. (We've seen >600x speed-ups over Mypy in some real-world projects.)
14
331
98
reposted by
Dani Solà
Alexander Doria
9 months ago
New blogpost: "Training as we know it might end". It was originally a panorama of the new methods of synthetic generation but the stakes are now much higher and I openly wonder if model training is not soon going to change forever.
vintagedata.org/blog/posts/t...
4
40
11
reposted by
Dani Solà
Sebastian Röhl
9 months ago
Wow, that's an insanely cool website:
animejs.com/
Anime.js | JavaScript Animation Engine
A fast and versatile JavaScript animation library
https://animejs.com/
1
48
13
reposted by
Dani Solà
Rob Hyndman
10 months ago
A new Python edition of "Forecasting: Principles and Practice" is now available online at
otexts.com/fpppy/
. Thanks to
@azulgarza.bsky.social
, Cristian Challu, Max Mergenthaler, Kin Olivares & Nixtla for making this happen.
#forecasting
#python
Forecasting: Principles and Practice, the Pythonic Way
https://otexts.com/fpppy/
3
81
27
reposted by
Dani Solà
Joy Gao
10 months ago
Interesting read on ClickHouse’s query condition cache (not a query result cache) — efficient indices built on the fly to reduce unnecessary full table scans for repeated queries.
clickhouse.com/blog/introdu...
Introducing the query condition cache
Repeated queries are everywhere—in dashboards, alerts, observability, and more. Learn how ClickHouse now skips redundant work by caching filter results per granule.
https://clickhouse.com/blog/introducing-the-clickhouse-query-condition-cache
1
15
2
reposted by
Dani Solà
Ethan Mollick
10 months ago
So it looks like there's a third scaling law: you can make models better by (1) training them with more compute, by (2) having them "think" for longer about an answer, or by (now 3) generating large numbers of answers in parallel & picking good ones. Both 2 & 3 seem to have lots of low-hanging fruit.
3
115
14
reposted by
Dani Solà
Eric Colson
12 months ago
Operationalizing Machine Learning: An Interview Study by
@joehellerstein.bsky.social
,
@adityagp.bsky.social
, et al. Particularly love the part on "Retrofitting Explanations".
#MachineLearning
#MLOps
#Datascience
.
arxiv.org/pdf/2209.09125
1
12
6
reposted by
Dani Solà
Grace Lindsay
11 months ago
This is a pretty cool resource for applied ML: a list of "case studies" sourced from different companies describing problems they face and the methods they've tried to solve them. Anyone know of something like this specific to geospatial/remote sensing data problems?
#MLsky
#CCAI
#GISchat
Evidently AI - ML and LLM system design: 500 case studies
How do top companies apply AI? A database of 500 case studies from 100+ companies with practical ML use cases, LLM applications, and learnings from designing ML and LLM systems.
https://www.evidentlyai.com/ml-system-design
1
51
9
reposted by
Dani Solà
Bojan Tunguz
11 months ago
We are continuing with our series of posts on some non-trivial use cases for XGBoost. In this latest post we talk about using Shapley *interaction* values for feature engineering. 1/2
2
8
3
Just published a post about building smart services at CLARK. A pragmatic approach that worked very well for us, going from heuristics to ML. Thoughts and feedback welcome!
#datasky
#data
#databs
medium.com/clark-engine...
A Blueprint for Smart Services
In today’s fast-paced world, creating intelligent services that adapt and improve over time is crucial for business success. This post…
https://medium.com/clark-engineering/a-blueprint-for-smart-services-a358bdee1054
11 months ago
1
3
0
reposted by
Dani Solà
Jess Calarco
about 1 year ago
Despite patriarchy's persistence, growing numbers of men believe they have it worse off than women. And, new research shows this "male victimhood" ideology is most common among men who aren't facing hardship. Which means what they're really feeling is status loss. 1/
www.psypost.org/male-victimh...
Male victimhood ideology driven by perceived status loss, not economic hardship, among Korean men
Research published in Sex Roles suggests that male victimhood ideology among South Korean men is driven more by perceived socioeconomic status decline rather than objective economic hardship.
https://www.psypost.org/male-victimhood-ideology-driven-by-perceived-status-loss-not-economic-hardship-among-korean-men/
229
7017
2296
reposted by
Dani Solà
Sung Kim
about 1 year ago
DeepSeek-R1! ⚡ Performance on par with OpenAI-o1 📖 Fully open-weight model & technical report 🏆 MIT licensed: Distill & commercialize freely! 🌐 Website & API are live now! Demo:
chat.deepseek.com
Models:
huggingface.co/deepseek-ai
2
78
27
reposted by
Dani Solà
Chris
about 1 year ago
First post of the year!
@andypavlo.bsky.social
got me thinking about why Confluent didn't build WarpStream. My conclusion: legacy infrastructure companies are going to have a tough time against cloud native, AI-enabled, post-ZIRP competitors.
Infrastructure Vendors Are in a Tough Spot
Cloud native, AI-enabled, post-ZIRP companies are the new apex predator.
https://materializedview.io/p/infrastructure-vendors-are-in-a-tough
4
31
8
reposted by
Dani Solà
Qian Li
about 1 year ago
The MemoryDB paper shows the power of separating responsibilities through clever composition. I think this DB frontend/execution plus a distributed transaction log pattern can be promising for creating serverless variants of many popular databases. E.g., Aurora adopts a similar decoupling approach.
1
52
11
reposted by
Dani Solà
Alex Miller
about 1 year ago
OLTP Through the Looking Glass 16 Years Later: Communication is the New Bottleneck
www.cs.cit.tum.de/fi...
0
8
3
I love dbt, but
sdf.com
looks very promising: faster runtime, improved reports, column-level lineage, etc. Does anyone have experience running it in production?
#databs
#datasky
SDF Labs | Data Runs Better on SDF
SDF is the next generation transformation layer and best developer platform for data. A compiler and execution engine designed to improve the data engineering experience, with compile time guarantees ...
https://www.sdf.com/
about 1 year ago
0
6
0
reposted by
Dani Solà
Nikhil Benesch
about 1 year ago
S3 (Iceberg) Tables is everything I dreamt of, and more. I blogged some long-form thoughts:
meltware.com/2024/12/04/s...
I think we're about to see an explosion of data tools (
@materialize.com
,
@clickhouse.com
,
@duckdb.org
, et al.) learn to write Iceberg tables via S3 table buckets.
#databs
A First Look at S3 (Iceberg) Tables
AWS announced S3 Tables today, which brings native support for Apache Iceberg to S3. It’s hard to overstate how exciting this is for the data analytics ecosystem. This post is a quick rundown of my th...
https://meltware.com/2024/12/04/s3-tables
15
107
44
reposted by
Dani Solà
Martin Kleppmann
about 1 year ago
Seems like a safe bet that object storage as a foundation of data systems architecture is here to stay
blog.colinbreck.com/predicting-t...
Predicting the Future of Distributed Systems
There are significant changes happening in distributed systems.
https://blog.colinbreck.com/predicting-the-future-of-distributed-systems/
12
308
48
reposted by
Dani Solà
Dr. Verónica Espinoza
about 1 year ago
📕A Portable Introduction to Data Analysis (open access) 2024. By Michael Bulmer 👉(
uq.pressbooks.pub/portable-int...
)
#Statistics
#Datavisualization
#MachineLearning
#DataScience
#Python
#rstudio
#PhD
#bioinformatics
#Rstudio
#neuroscience
#postdoc
#research
#stats
#AI
1
23
5
reposted by
Dani Solà
Jack Vanlightly
about 1 year ago
New blog post! Big data isn’t dead; it’s just going incremental. But bad things happen when uncontrolled changes collide with incremental jobs. Reacting to changes is a losing strategy.
jack-vanlightly.com/...
Incremental Jobs and Data Quality Are On a Collision Course - Part 1 - The Problem — Jack Vanlightly
Big data isn’t dead; it’s just going incremental. If you keep an eye on the data space ecosystem like I do, then you’ll be aware of the rise of DuckDB and its message that big data is dead. The idea comes from two industry papers (and associated data sets), one from the Redshift team (paper and dataset) and one from Snowflake (paper and dataset). Each paper analyzed the queries run on their platforms, and some surprising conclusions were drawn – one being that most queries were run over quite small data. The conclusion (of DuckDB) was that big data was dead, and you could use simpler query engines rather than a data warehouse. It’s far more nuanced than that, but data shows that most queries are run over smaller datasets. Why?
https://jack-vanlightly.com/blog/2024/11/13/incremental-jobs-and-data-quality-are-on-a-collision-course-part-1-the-problem
7
25
7