Hasan Geren
@hgeren.bsky.social
39
87
9
Data Engineer 🧑🏻‍💻 Stream Processing Researcher 🔬 Nerd 🤓 Metalhead 🤘🏻
reposted by
Hasan Geren
Pipeline To Insights
10 months ago
Data ingestion with dlt and Dagster: An end-to-end pipeline tutorial: Curious like us to see what people are sharing with
#dataBS
and
#datasky
? Check out this post to learn how to do it using dlt!
@matthausk.bsky.social
@datateam.bsky.social
@hgeren.bsky.social
@hopefanhe.bsky.social
#dlt
Data ingestion with dlt and Dagster: An end-to-end pipeline tutorial
Ingest Data from Bluesky API to AWS S3 Using dlt and deploy it on Dagster in Just 15 Minutes.
https://open.substack.com/pub/pipeline2insights/p/data-ingestion-with-dlt-bluesky-to-s3-on-dagster?r=p5bpr&utm_campaign=post&utm_medium=web&showWelcomeOnShare=true
0
9
1
reposted by
Hasan Geren
Pipeline To Insights
10 months ago
We are starting a 32-week Data Engineering Interview Guide program, covering everything from fundamentals to advanced topics, with sessions every Saturday. Do you think we're missing any critical topics? We're curious about your opinions.
#dataBS
#datasky
Week 0/32 - A Comprehensive Data Engineering Interview Preparation Guide
Join us every Saturday on This New Journey
https://open.substack.com/pub/pipeline2insights/p/week-032-a-comprehensive-data-engineering?utm_source=app-post-stats-page&r=p5bpr&utm_medium=ios
0
4
3
reposted by
Hasan Geren
Pipeline To Insights
10 months ago
As a Data Engineer, understanding the data storage lifecycle and data retention policies is critical for designing efficient, cost-effective, and compliant data systems.
@joereis.bsky.social
#dataBS
#datasky
substack.com/@pipeline2in...
0
7
2
reposted by
Hasan Geren
Pipeline To Insights
10 months ago
In our new post, we've covered 10 of the most popular data pipeline design patterns. We'd love to hear your thoughts. For more details, please check out the full post created by (
@hgeren.bsky.social
and
@hopefanhe.bsky.social
):
open.substack.com/pub/pipeline...
#dataBS
#datasky
10 Pipeline Design Patterns for Data Engineers
How to leverage Design Patterns for scalable and efficient data pipelines
https://open.substack.com/pub/pipeline2insights/p/10-pipeline-design-patterns-for-data?r=p5bpr&utm_campaign=post&utm_medium=web
0
3
2
reposted by
Hasan Geren
Pipeline To Insights
10 months ago
Discover how dlt simplifies data ingestion. Learn its origins and real-world use cases. Follow a step-by-step guide to build your first pipeline and join the growing dlt community!
@matthausk.bsky.social
@datateam.bsky.social
@hgeren.bsky.social
@hopefanhe.bsky.social
#dataBS
#datasky
Introduction to data load tool (dlt): A Python Library for Simple Data Ingestion
Discover the basics of dlt and its role in modern data engineering workflows
https://open.substack.com/pub/pipeline2insights/p/introduction-to-data-load-tool-dlt?utm_source=app-post-stats-page&r=p5bpr&utm_medium=ios
2
9
3
reposted by
Hasan Geren
Pipeline To Insights
11 months ago
Hi, wishing everyone a great Thanksgiving! Recently we wrote about how SQL queries are executed behind the scenes. If you are interested, check out our post:
open.substack.com/pub/pipeline...
#dataBS
#datasky
0
6
2
reposted by
Hasan Geren
Pipeline To Insights
11 months ago
Storage is at the heart of Data Engineering. In this post, we explore the hierarchy of data storage from the ground up, drawing inspiration from Fundamentals of Data Engineering by
@joereis.bsky.social
and Matt Housley, as well as insights from the DE Professionals on Coursera.
#dataBS
#datasky
Storage Fundamentals For Data Engineers
Why organised and durable storage is the cornerstone of Data Engineering?
https://open.substack.com/pub/pipeline2insights/p/storage-fundamentals-every-data-engineer?utm_source=app-post-stats-page&r=p5bpr&utm_medium=ios
3
16
2
Hey
#dataBS
and
#datasky
folks, our new post about "how understanding Big O Notation & Execution Plans can optimize SQL queries" has just been published. Check it out if you're interested, and we'd love to hear your thoughts!
@hopefanhe.bsky.social
open.substack.com/pub/pipeline...
SQL Behind the Curtain: How Are Queries Executed?
Explore the journey of your SQL query guided by execution plans
https://open.substack.com/pub/pipeline2insights/p/sql-behind-the-curtain-how-are-queries?r=p5bpr&utm_campaign=post&utm_medium=web
11 months ago
1
8
2
Hey
#dataBS
, I've been thinking of an analogy for Data Teams' roles. Imagine a company as a vehicle. How would you map Data Engineering, Analytics, and Science to vehicle parts? Teams could have multiple parts or overlap with other Teams. Curious about your thoughts!
11 months ago
2
4
0
reposted by
Hasan Geren
Adam Marcus
11 months ago
Looking for a distraction? Try this great interview between
@hannes.muehleisen.org
and
@medriscoll.bsky.social
covering all things
@duckdb.org
. I especially enjoyed the philosophy around improving SQL usability.
www.youtube.com/watch?v=a-Rm...
#databs
Data Talks on the Rocks 5 - Hannes Mühleisen, DuckDB
YouTube video by Rill Data
https://www.youtube.com/watch?v=a-RmhY5RPVg
0
14
4
reposted by
Hasan Geren
Christian Minich
11 months ago
#dataBS
can you repost? Filled up the first 150 and so am creating a second starter pack! Let's all keep finding each other and make this place the best for all things data
2
13
5
reposted by
Hasan Geren
Erfan Hesami
11 months ago
Week 1 of "100 Days of SQL Optimisation" covered key techniques like column selection, multicolumn indexes, filtering, window functions, Rank, CTE and composite indexes with IMDb data. Check out the full post for more!
@hgeren.bsky.social
#dataBS
#datasky
Week #1: 100 Days of SQL Optimisation
How Small Tweaks Transformed Our Queries, Saving Time and Resources
https://open.substack.com/pub/pipeline2insights/p/week-1-100-days-of-sql-optimisation?utm_source=app-post-stats-page&r=p5bpr&utm_medium=ios
0
6
1
reposted by
Hasan Geren
Chris
12 months ago
I made an infra engineer starter pack. Folks posting about databases, stream processing, durable execution, orchestrators, service meshes, and more.
go.bsky.app/SCZe42X
44
290
91
Hello everyone! I'm Hasan. I transitioned from Industrial Engineering to Data Science, then found my passion in Data Engineering. Currently, doing a PhD in distributed stream processing while working as a Data Engineer. Looking forward to connecting with fellow data enthusiasts to learn and share.
11 months ago
0
3
0
Just joined and heard
#dataBS
and
#datasky
are where the cool kids hang. Wanted to introduce our blog where we regularly write about Data Engineering concepts, news, and tools.
pipeline2insights.substack.com
11 months ago
2
15
3