Bijil Subhash
@bijilsubhash.bsky.social
📤 101
📥 232
📝 74
Data Engineer, Recovering Academic, and Entrepreneur | bijilsubhash.io | Sydney, Australia
Databricks vs Fabric feels a lot like Pied Piper vs Nucleus. Fans of the Silicon Valley show will get the reference :)
#databs
#databricks
#fabric
7 months ago
0
2
0
(1/3) Among programming languages, I consider #Python to be a relatively easy-to-learn language, opening doors for many to start coding without formal training. However, this also results in some poorly written, unmaintainable, and non-extensible code: the infamous spaghetti code.
8 months ago
1
1
0
Just finished watching the webinar introducing SDF by the dbt team. After seeing SDF in action, I have to admit that I am really looking forward to the future of the dbt engine. I was wondering when dbt was going to bring notable changes to the developer experience, and this might be it.
#databs
8 months ago
0
3
1
I was speaking with someone who went all in on promoting duckdb to their clients. I did not get a chance to ask what exactly they are doing with duckdb, but I am curious to understand how duckdb is utilised in modern data pipelines.
9 months ago
0
1
0
(1/4) SDF acquisition by dbt. If you work in data, you have probably come across a version of this headline this past week. A small disclaimer: I have not used SDF, nor do I have a solid understanding of the tech that sits behind it, so take what I say with a grain of salt.
9 months ago
1
0
0
(1/2) Maybe an unpopular opinion: SQL is a powerful language and, despite what anyone says, it is unlikely to be replaced by an LLM, at least not with the models we have today. LLMs are powerful and can be leveraged to generate ideas or to unblock you when you are stuck.
#databs
9 months ago
1
2
0
(1/3) Continuing from my previous thread on infrastructure as code for managing #Databricks: I have recently had the pleasure of working with an open source tool called Laktory, an abstraction that sits on top of Terraform/Pulumi to manage your Databricks workflow using YAML.
#databs
9 months ago
1
2
1
A default approach that I take when it comes to data modelling. It works because OBT is optimized for the modern vectorized data warehouses. At the same time, the underlying data is modelled using established best practices from Kimball.
9 months ago
0
1
0
(1/2) Infrastructure as code (IaC) is ubiquitous in the data space. That being said, I have stayed away from doing any IaC work for as long as I can remember, mainly due to its aura of being difficult and also because I could pass the ball to the platform team.
#datasky
#databs
9 months ago
2
2
0
(1/4) What do you use for #data ingestion? It's true that there is no shortage of tools when it comes to data ingestion. But before you open your wallet for one of the many options out there, it might be worth doing thorough due diligence based on your current and future needs.
#databs
9 months ago
1
0
0
I have been binging on the early chapters of the new book from @joereis.bsky.social on data modeling. I haven't consumed a lot of material on this topic besides Kimball, but this one is a must-read if you work with data in a modern context. Looking forward to the official release in 2025!
#databs
9 months ago
1
7
1
(1/5) dbt core vs dbt cloud. dbt has been a game changer for many #data teams, mainly for writing reusable and version-controlled transformation logic. We are also witnessing an explosion of tools that want to become the next #dbt. Which is better, dbt core or dbt cloud?
#databs
9 months ago
1
1
0
reposted by
Bijil Subhash
Andy Pavlo
9 months ago
Buckle up because we're banging into the new year with my annual retrospective of the last year in databases! Highlights include license change blowback, Databricks vs. Snowflake gangwar, @duckdb.org's shotgun weddings, and buying a quarterback to impress your lover:
www.cs.cmu.edu/~pavlo/blog/...
Databases in 2024: A Year in Review
Andy rises from the ashes of his dead startup and discusses what happened in 2024 in the database game.
https://www.cs.cmu.edu/~pavlo/blog/2025/01/2024-databases-retrospective.html
10
201
85
(1/5) What is Unity Catalog (UC) in the context of #databricks? The standard definition that you get is that it is a unified governance solution built into Databricks. It is accurate, but that was not intuitive to me when I started building on UC. See 🧵 for some additional context on UC.
#databs
10 months ago
1
0
0
(1/2) Autoloader is without doubt one of my favourite features in Databricks! In a nutshell, it is an abstraction that simplifies the incremental ingestion of data by monitoring the files that arrive in cloud storage, supporting reliable and resilient data pipelines cost-efficiently.
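For anyone who has not seen it, here is a rough sketch of what an Autoloader read can look like in PySpark. It is only an illustration under assumptions: the function name, the JSON file format, and every path and table name passed in are hypothetical, and `spark` is expected to be an active SparkSession on Databricks.

```python
def start_autoloader_ingest(spark, source_path, schema_path, checkpoint_path, target_table):
    """Incrementally load newly arrived JSON files into a table (illustrative sketch)."""
    stream = (
        spark.readStream.format("cloudFiles")  # "cloudFiles" selects Auto Loader as the source
        .option("cloudFiles.format", "json")  # assumed format of the incoming files
        .option("cloudFiles.schemaLocation", schema_path)  # where the inferred schema is tracked
        .load(source_path)  # cloud storage folder being monitored
    )
    return (
        stream.writeStream
        .option("checkpointLocation", checkpoint_path)  # remembers which files were already ingested
        .trigger(availableNow=True)  # process everything pending, then stop
        .toTable(target_table)
    )
```

The checkpoint is what makes the ingestion incremental: on the next run only files that arrived since the previous run should be picked up.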
#databs
10 months ago
1
2
0
If you are an aspiring data engineer, do yourself a favour by learning the basics of data modelling, amongst other fundamentals (Python and SQL), before jumping into whatever tool is in the headlines.
10 months ago
0
3
0
Great write-up on LLM frameworks.
www.anthropic.com/research/bui...
Without attaching a name to it, I have tested all except the agent workflow in 2024. Currently running orchestrator-worker, prompt chaining, and routing workflows across a handful of projects in production.
Building effective agents
A post for developers with advice and workflows for building effective AI agents
https://www.anthropic.com/research/building-effective-agents
10 months ago
1
0
1
I am sure some of us can relate to this.
#dataengineer
#data
#dataarchitecture
10 months ago
0
1
0
Five ways to copy a dictionary in Python:
- Unpacking an iterable*
- Use copy method*
- Using dict constructor*
- Dictionary comprehension*
- Using deepcopy (from copy)
The first three have similar performance, followed by dictionary comprehension, and finally deep copy.
*shallow copy
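A quick sketch of the five approaches side by side, assuming "unpacking" refers to the `{**d}` syntax. The sample dictionary and variable names are made up for illustration; the point is to show how the shallow copies differ from the deep copy when a nested value changes.

```python
import copy

# Hypothetical sample data with a nested, mutable value.
original = {"name": "pipeline", "tags": ["bronze", "silver"]}

unpacked = {**original}                                   # unpacking
method_copy = original.copy()                             # copy method
constructed = dict(original)                              # dict constructor
comprehension = {key: value for key, value in original.items()}  # comprehension
deep = copy.deepcopy(original)                            # deep copy

# Mutate the nested list on the original dictionary.
original["tags"].append("gold")

shallow_copies = [unpacked, method_copy, constructed, comprehension]

# Shallow copies share the nested list, so they all see the new tag.
shallow_sees_change = all("gold" in c["tags"] for c in shallow_copies)

# The deep copy owns its own nested list and is unaffected.
deep_sees_change = "gold" in deep["tags"]

print(shallow_sees_change, deep_sees_change)
```

Top-level keys are independent in all five copies; only nested mutable values are shared by the shallow ones.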
10 months ago
0
1
0
Do we have any data engineers in the #buildinpublic community? If so, what are you building?
10 months ago
1
2
0
What is your go-to analogy for explaining what a data engineer does? Check the 🧵 for the one that I use from time to time.
10 months ago
1
1
0