Georg Heiler
@geoheil.com
building socio-technical complex systems with data | geoheil.com
pinned post!
@milicevica23.bsky.social
and I recently gave a talk on how we scale
#data
#pipelines
for Telekom
georgheiler.com/event/magent...
Scaling data pipelines @Telekom | Georg Heiler
Tackling data challenges via the orchestrator.
https://georgheiler.com/event/magenta-data-architecture-25/
6 months ago
3
1
1
#duckdb
#ducklake
#cmu
www.youtube.com/watch?v=z2Gh...
DuckLake: Learning from Cloud Data Warehouses to Build a Robust “Lakehouse” (Jordan Tigani)
YouTube video by CMU Database Group
https://www.youtube.com/watch?v=z2GhznqtIz0
about 1 hour ago
0
1
0
Something around supercomputing is in the making. Anyone daring out there who wants to explore? Or folks who want to exchange ideas about SLURM, HET jobs, and advanced resource management?
github.com/ascii-supply...
feat: build SLURM integration for dagster by HPicatto · Pull Request #19 · ascii-supply-networks/dagster-slurm
Type of Change feat: New feature fix: Bug fix docs: Documentation style: Code style refactor: Code refactor perf: Performance improvement test: Tests chore: Maintenance Description adds ...
https://github.com/ascii-supply-networks/dagster-slurm/pull/19
about 6 hours ago
0
0
0
Simple Sovereign Scalable Data Stack
georgheiler.com/event/tdwi-2...
precursor:
pypi.org/project/dags...
github.com/dagster-io/c...
if you want to see this in action, join us in Nürnberg or Vienna for some sovereign, scalable data talks in the coming weeks
Simple Sovereign Scalable Data Stack | Georg Heiler
Tired of cloud lock-in and surprise bills? This talk shows how to build a fast, portable analytics stack around DuckDB and Dagster. Along the way of our journey to sovereignty and scale we touch on…
https://georgheiler.com/event/tdwi-25-roundtable-nuernberg/
about 13 hours ago
2
5
0
#duckdb
#multimodal
#rag
www.youtube.com/watch?v=2qSZ...
blobs.duckdb.org/events/duckd...
When the duck quacks: Multimodal querying with FlockMTL
YouTube video by DuckDB
https://www.youtube.com/watch?v=2qSZzydUT7g
7 days ago
0
3
0
#compliance
#anonymization
#python
www.youtube.com/watch?v=EqQd...
Katharine Jarmul - Anonymization: Why is it so hard? (PyData Prague #27)
YouTube video by PyData
https://www.youtube.com/watch?v=EqQdaHSBbY8
12 days ago
0
0
0
#gis
#medium-data
#sedona
#rust
#datafusion
sedona.apache.org/latest/blog/...
Introducing SedonaDB: A single-node analytical database engine with geospatial as a first-class citizen - Apache Sedona
Apache Sedona is a cluster computing system for processing large-scale spatial data. Sedona extends existing cluster computing systems, such as Apache Spark, Apache Flink, and Snowflake, with a set of...
https://sedona.apache.org/latest/blog/2025/09/24/introducing-sedonadb-a-single-node-analytical-database-engine-with-geospatial-as-a-first-class-citizen/
13 days ago
0
0
0
reposted by
Georg Heiler
DuckDB
21 days ago
DuckDB 1.4.0 is out! This is our first LTS release which comes with *one year of community support*. It also supports database encryption, the MERGE SQL statement and Iceberg writes. For more details, read the announcement blog post at
duckdb.org/2025/09/16/a...
0
54
25
A living Elo leaderboard for analytics/OLAP engines. Each public benchmark (TPC-DS/H, SSB, vendor & community posts) becomes a “match.” Upsets + context matter. Browse the board & poke holes:
rebrand.ly/ey6y7hf
Home | Data inconsistencies
Data inconsistencies, architecture and real-world stories
https://data-inconsistencies.datajourney.expert/
about 1 month ago
0
0
0
#owasp
now gearing up for
#llm
and
#genai
- Multi-Agentic system Threat Modeling Guide v1.0
genai.owasp.org/resource/mul...
Multi-Agentic system Threat Modeling Guide v1.0
This guide builds on the OWASP Agentic AI – Threats and Mitigations publication, our master agentic threat taxonomy, by applying its threat taxonomy to real-world multi-agent systems (MAS). These…
https://genai.owasp.org/resource/multi-agentic-system-threat-modeling-guide-v1-0/
about 2 months ago
0
0
0
cool - feature server goes
#duckdb
#gis
github.com/tobilg/duckd...
GitHub - tobilg/duckdb_featureserv: A lightweight RESTful geospatial feature server for DuckDB with duckdb-spatial support
A lightweight RESTful geospatial feature server for DuckDB with duckdb-spatial support - tobilg/duckdb_featureserv
https://github.com/tobilg/duckdb_featureserv
about 2 months ago
0
4
1
kartproject.org
#vcs
#geospatial
Kart
Distributed version-control for geospatial and tabular data.
https://kartproject.org/
2 months ago
0
0
0
#ai
#small
#models
www.turing.ac.uk/blog/why-we-...
Why we still need small language models – even in the age of frontier AI
Lean, locally run models can unlock huge benefits for public sector and compute-constrained environments
https://www.turing.ac.uk/blog/why-we-still-need-small-language-models-even-age-frontier-ai?utm_source=LinkedIn&utm_medium=Text_link&utm_campaign=Turing-Blog_Why-we-still-need-small-language-models
2 months ago
0
0
1
@prefix.dev
awesome to see dependency overrides merged
github.com/prefix-dev/p...
feat: pypi-options.dependency-overrides by HernandoR · Pull Request #3948 · prefix-dev/pixi
add dependency-overrides for overriding. only updated manifest for now related #3917 #3890
https://github.com/prefix-dev/pixi/pull/3948
3 months ago
1
0
0
Together with
www.linkedin.com/in/aaron-cul...
I have created a template for making
#LLMs
(from different vendors and even self hosted ones) easily accessible to researchers - including advanced document RAG with
#docling
.
github.com/complexity-s...
GitHub - complexity-science-hub/llm-in-a-box-template: Template to use genai for chatting and via api to accelerate research
Template to use genai for chatting and via api to accelerate research - complexity-science-hub/llm-in-a-box-template
https://github.com/complexity-science-hub/llm-in-a-box-template
3 months ago
1
1
1
#fern
#hacker
#bears
#apt
www.youtube.com/watch?v=ZhfI...
The Hunt for the World's Most Dangerous Hackers
YouTube video by fern
https://www.youtube.com/watch?v=ZhfI0EboPU0
3 months ago
0
1
0
reposted by
Georg Heiler
Martin Kleppmann
3 months ago
People say “your imagination is the limit” to mean the possibilities are limitless, but I believe that for many people the phrase is more literally true: they really are limited by a lack of imagination more than anything else
2
25
1
#state
of
#cyber
for
#germany
www.heise.de/news/Bundesr...
quite sad to read that a large share of the data centers lack emergency power supplies
Sogar Notstrom fehlt: Schlechte Sicherheitstandards in Rechenzentren des Bundes (Even emergency power is missing: poor security standards in federal data centers)
A report by the Bundesrechnungshof (the German federal court of auditors) casts an unflattering light on the security of federal IT. Only a fraction of the data centers reportedly meet minimum standards.
https://www.heise.de/news/Bundesrechnungshof-Sicherheitsniveau-der-Bundes-IT-unzureichend-10475018.html
3 months ago
0
0
0
nice to see data engineering (DE) tooling adopted in other verticals -
github.com/michimussato...
here for visualization rendering
4 months ago
0
1
0
reposted by
Georg Heiler
Simon Willison
4 months ago
I particularly appreciated principle 3: Agent actions and planning must be observable
1
40
4
reposted by
Georg Heiler
Martin Kleppmann
4 months ago
Congratulations
@lambda.bsky.social
! Today
@theguardian.com
is launching a new way for whistleblowers to anonymously contact journalists, based on years-long research by Daniel and other colleagues.
www.theguardian.com/gnm-press-of...
The Guardian launches Secure Messaging, a world-first from a media organisation, in collaboration with the University of Cambridge
Secure Messaging is a new innovation for confidential story-sharing and source protection, underpinning the Guardian’s commitment to investigative journalism. The Guardian has published the open sourc...
https://www.theguardian.com/gnm-press-office/2025/jun/09/the-guardian-launches-secure-messaging-a-world-first-from-a-media-organisation-in-collaboration-with-the-university-of-cambridge
1
399
170
#dagster
journey for refining the documentation explained by
@pedramnavid.com
www.youtube.com/watch?v=dVB5...
#datacouncil
Write Less More: How Dagster Rebuilt Our Docs From the Ground Up
YouTube video by Data Council
https://www.youtube.com/watch?v=dVB5hLPyHCM
4 months ago
1
1
0
story of
#duckdb
www.youtube.com/watch?v=o53o...
@hannes.muehleisen.org
awesome talk!
Liberate Analytical Data Management with DuckDB
YouTube video by Data Council
https://www.youtube.com/watch?v=o53onmgnQDU
4 months ago
1
10
0
#malloydata
#datacouncil
#sql
#semanticlayer
www.youtube.com/watch?v=DZkX...
Building Blocks: Advanced Semantic Data Model Layering
YouTube video by Data Council
https://www.youtube.com/watch?v=DZkXvdzYlVs
4 months ago
0
3
0
#influxdb
#datafusion
#datacouncil
www.youtube.com/watch?v=8A4R...
Building InfluxDB 3 Core: Real-Time Columnar DB & Data Processor On Object Storage
YouTube video by Data Council
https://www.youtube.com/watch?v=8A4RO3ruKfU
4 months ago
0
1
0
#darwin
#gödel
#ai
x.com/adcock_brett...
Brett Adcock on X: "Japan's Sakana AI dropped Darwin Gödel Machine, a self-improving agent that can modify its code to boost performance On SWE-bench, DGM improved its performance from 20.0% to 50.0%, while on Polyglot, it increased its success rate from 14.2% to 30.7% https://t.co/Y7IZVjTQZa https://t.co/axNrZZvt7M" / X
Japan's Sakana AI dropped Darwin Gödel Machine, a self-improving agent that can modify its code to boost performance On SWE-bench, DGM improved its performance from 20.0% to 50.0%, while on Polyglot, it increased its success rate from 14.2% to 30.7% https://t.co/Y7IZVjTQZa https://t.co/axNrZZvt7M
https://x.com/adcock_brett/status/1929207216910790946
4 months ago
0
1
1
#ducklake
#duckdb
www.youtube.com/watch?v=zeon...
I think that once an ecosystem and more Iceberg interoperability, especially across different engines, become available, this will be an awesome approach.
Introducing DuckLake
YouTube video by DuckDB
https://www.youtube.com/watch?v=zeonmOO9jm4
4 months ago
0
2
0
#ai
#embeddings
#privacy
What are the implications of embedding
#inversion
?
x.com/jxmnop/statu...
jack morris on X: "excited to finally share on arxiv what we've known for a while now: All Embedding Models Learn The Same Thing embeddings from different models are SO similar that we can map between them based on structure alone. without *any* paired data feels like magic, but it's real:🧵" / X
excited to finally share on arxiv what we've known for a while now: All Embedding Models Learn The Same Thing embeddings from different models are SO similar that we can map between them based on structure alone. without *any* paired data feels like magic, but it's real:🧵
https://x.com/jxmnop/status/1925224612872233081
5 months ago
0
2
0
#cyber
#security
#rap
www.youtube.com/watch?v=6Hcs...
#gangsta
Cyber Gangsta’s Paradise | Prof. Merli ft. MC BlackHat [Parody Music Video]
YouTube video by Dominik Merli | Professor für IT-Sicherheit
https://www.youtube.com/watch?v=6Hcs78NTqpE
5 months ago
0
0
0
cool
#rust
project - automatic grammar fuzzer
github.com/R9295/autarkie
GitHub - R9295/autarkie: Autarkie - Instant Grammar Fuzzing Using Rust Macros
Autarkie - Instant Grammar Fuzzing Using Rust Macros - R9295/autarkie
https://github.com/R9295/autarkie
5 months ago
0
1
0
#blackhat
www.youtube.com/watch?v=84NV...
SpAIware & More: Advanced Prompt Injection Exploits in LLM Applications
YouTube video by Black Hat
https://www.youtube.com/watch?v=84NVG1c5LRI
5 months ago
0
1
0
#fpga
#python
executing Python code directly in hardware
www.runpyxl.com/gpio
Python sub-micro GPIO – PyXL Benchmark
PyXL runs Python directly in hardware. GPIO toggle in sub-micro. See how it works.
https://www.runpyxl.com/gpio
5 months ago
0
1
0
#xroad
#xtee
truly impressive data exchange backbone
github.com/nordic-insti...
GitHub - nordic-institute/X-Road: Source code of the X-Road® data exchange layer software
Source code of the X-Road® data exchange layer software - nordic-institute/X-Road
https://github.com/nordic-institute/X-Road
5 months ago
1
0
0
#horseless
#carriages
of
#ai
koomen.dev/essays/horse...
do we use the right
#abstractions
?
AI Horseless Carriages | koomen.dev
An essay about bad AI app design
https://koomen.dev/essays/horseless-carriages/
6 months ago
1
1
0
reposted by
Georg Heiler
Martin Kleppmann
6 months ago
Incremental view maintenance is how to make the idea of "turning the database inside-out" actually happen, and I'm excited to see more interest in the area
0
58
4
luminousmen.com/post/data-en...
Data Engineering: Now with 30% More Bullshit
Tools don't solve problems. People do. No buzzword replaces craftsmanship.
https://luminousmen.com/post/data-engineering-now-with-30-more-bullshit
6 months ago
0
0
0
#postgres
#search
blog.vectorchord.ai/postgresql-f...
PostgreSQL BM25 Full-Text Search: Speed Up Performance with These Tips
Boost PostgreSQL full-text search speed by 50x with simple optimizations. Use VectorChord-BM25 for faster and better BM25 ranking in Postgres.
https://blog.vectorchord.ai/postgresql-full-text-search-fast-when-done-right-debunking-the-slow-myth
6 months ago
0
1
0
#ct3003
#ai
www.youtube.com/watch?v=YI5Q...
Was ich mit KI mache (What I do with AI)
YouTube video by c't 3003
https://www.youtube.com/watch?v=YI5QsBWQQZI
6 months ago
1
0
0
#rust
and
#data
github.com/rewrite-bigd...
make it faster
GitHub - rewrite-bigdata-in-rust/RBIR: A collection of RBIR projects and posts for anyone interested in joining this journey.
A collection of RBIR projects and posts for anyone interested in joining this journey. - rewrite-bigdata-in-rust/RBIR
https://github.com/rewrite-bigdata-in-rust/RBIR
6 months ago
0
2
0
#european
#cloud
an interesting article on why it hasn't taken off yet
berthub.eu/articles/pos...
But how to get to that European cloud? - Bert Hubert
The very short version: It has now become clear that European governments can no longer rely on American clouds, and that we lack good and comprehensive alternatives. Market forces have failed to deli...
https://berthub.eu/articles/posts/now-how-to-get-that-european-cloud/
6 months ago
0
1
0
#finally
#oracle
begins to embrace
#dbt
the 23c release adds schema-level grants: grant select/insert/update/delete any table on schema xxxx to user;
6 months ago
0
2
0
reposted by
Georg Heiler
Martin Kleppmann
7 months ago
Strongly agree – the talk is a must-watch if you want to get better at writing
4
105
14
It was an awesome evening tonight discussing
#open-data
at
#metalab
in
#vienna
georgheiler.com/event/open-d...
one key learning:
justizonline.gv.at/jop/web/iwg
and
www.data.gv.at/wp-content/u...
the company register is becoming more accessible
Open Data Hackathon Wien 25 | Georg Heiler
Principles of open data and data processing
https://georgheiler.com/event/open-data-hackathon-25/
7 months ago
0
1
0
found a great video today about
#performance
of teams
www.youtube.com/watch?v=BsYI...
“The Sociotechnical Path to High-Performing Teams (Begins With Observability)” by Charity Majors
YouTube video by Performance Summit
https://www.youtube.com/watch?v=BsYIvi3Sae8
7 months ago
0
2
1
neat paper from
#microsoft
about
#ai
automating
#data-analysis
www.youtube.com/watch?v=3ndl...
and
github.com/microsoft/da...
Data Formulator: Create Rich Visualization with AI iteratively
YouTube video by Microsoft Research
https://www.youtube.com/watch?v=3ndlwt0Wi3c
7 months ago
0
0
0
#timeseries
#clustering
#fails
arxiv.org/abs/2503.14393
a great paper
On the clustering behavior of sliding windows
Things can go spectacularly wrong when clustering timeseries data that has been preprocessed with a sliding window. We highlight three surprising failures that emerge depending on how the window size ...
https://arxiv.org/abs/2503.14393
7 months ago
0
2
1
Looking forward to meeting many interesting people at the Synthetic Data for Buildings and Energy Efficiency hackathon
Hackathon: Watt’s Up? Hack for Energy Efficiency – Synthetic Data for Buildings
Start date: 22 Feb 2025, 09:00 (CET) · Entry level: Intermediate · End date: 23 Feb 2025, 15:00 (CET) · Subject area: Artificial Intelligence · Location: TU Wien, Austria · Topics: Generation of synthetic…
https://buff.ly/3EJhLkl
7 months ago
0
0
0
What would you want to share with a younger version of yourself? What advice or concepts would have been relevant for you? Here is my list
georgheiler.com/post/learnin...
for aspiring data engineers
Upskilling data engineers | Georg Heiler
A comprehensive guide to modern data engineering with local-first development practices
https://georgheiler.com/post/learning-data-engineering/
7 months ago
1
0
0
#ai
caught
#cheating
time.com/7259395/ai-c...
When AI Thinks It Will Lose, It Sometimes Cheats
When sensing defeat in a match against a skilled chess bot, advanced models sometimes hack their opponent, a study found.
https://time.com/7259395/ai-chess-cheating-palisade-research/
7 months ago
0
0
0
Anyone in
#Vienna
fancy exchanging more thoughts about
#data
challenges? In particular,
#pipelines
and
#orchestration
?
lu.ma/dagster-dach
reach out if you are interested in joining - and thanks to
@dagster.io
the 1st round of beers is already funded.
Dagster User Group - DACH · Events Calendar
View and subscribe to events from Dagster User Group - DACH on Luma.
https://lu.ma/dagster-dach
7 months ago
0
0
0