Bijil Subhash
@bijilsubhash.bsky.social
📤 102
📥 232
📝 74
Data Engineer, Recovering Academic, and Entrepreneur | bijilsubhash.io | Sydney, Australia
Databricks vs Fabric feels a lot like Pied Piper vs Nucleus. Fans of the Silicon Valley show will get the reference :)
#databs
#databricks
#fabric
12 months ago
0
2
0
(1/3) Among programming languages, I consider #Python to be a relatively easy-to-learn language, opening doors for many to start coding without formal training. However, this also results in some poorly written, unmaintainable, and non-extensible code: the infamous spaghetti code.
about 1 year ago
1
1
0
Just finished watching the webinar introducing SDF by the dbt team. After seeing SDF in action, I have to admit that I am really looking forward to the future of the dbt engine. I was wondering when dbt was going to bring in notable changes to the developer experience, and this might be it.
#databs
about 1 year ago
0
3
1
I was speaking with someone who went all in on promoting duckdb to their clients. I did not get a chance to ask what exactly they are doing with duckdb, but I am curious to understand how duckdb is utilised in modern data pipelines.
about 1 year ago
0
1
0
(1/4) SDF acquisition by dbt. If you work in data, you have probably come across a version of this headline this past week. A small disclaimer: I have not used SDF, nor do I have a solid understanding of the tech that sits behind it, so take what I say with a grain of salt.
about 1 year ago
1
0
0
(1/2) Maybe an unpopular opinion: SQL is a powerful language and, despite what anyone says, it is unlikely to be replaced by an LLM, at least not with the models we have today. LLMs are powerful and can be leveraged to generate ideas or to unblock you when you are stuck.
#databs
about 1 year ago
1
2
0
(1/3) Continuing from my previous thread on infrastructure as code for managing #Databricks. I have recently had the pleasure of working with an open source tool called Laktory, which is an abstraction that sits on top of Terraform/Pulumi to manage your Databricks workflow using YAML.
#databs
about 1 year ago
1
2
1
A default approach that I take when it comes to data modelling. It works because OBT is optimized for the modern vectorized data warehouses. At the same time, the underlying data is modelled using established best practices from Kimball.
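As a rough illustration of that idea, here is a minimal sketch that models data as a small Kimball-style star schema and then flattens it into one big table (OBT) for querying. The table names, columns, and sample rows are hypothetical, and SQLite stands in for a real warehouse purely so the example runs anywhere.

```python
import sqlite3

# In-memory SQLite database as a stand-in for a warehouse (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Kimball-style dimensions and fact table (hypothetical names and data).
    CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, customer_name TEXT, country TEXT);
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_name TEXT, category TEXT);
    CREATE TABLE fact_sales (sale_id INTEGER PRIMARY KEY, customer_id INTEGER, product_id INTEGER, amount REAL);

    INSERT INTO dim_customer VALUES (1, 'Acme', 'AU'), (2, 'Globex', 'NZ');
    INSERT INTO dim_product VALUES (10, 'Widget', 'Hardware');
    INSERT INTO fact_sales VALUES (100, 1, 10, 25.0), (101, 2, 10, 40.0);

    -- One big table: the fact joined to its dimensions, materialised once.
    CREATE TABLE obt_sales AS
    SELECT f.sale_id, f.amount, c.customer_name, c.country, p.product_name, p.category
    FROM fact_sales AS f
    JOIN dim_customer AS c ON c.customer_id = f.customer_id
    JOIN dim_product AS p ON p.product_id = f.product_id;
    """
)

# Consumers query the wide table directly, with no joins at read time.
rows = conn.execute(
    "SELECT country, SUM(amount) FROM obt_sales GROUP BY country ORDER BY country"
).fetchall()
print(rows)
```

The dimensional model stays the source of truth, while the wide table is a derived serving layer that can be rebuilt whenever the underlying facts or dimensions change.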
about 1 year ago
0
1
0
(1/2) Infrastructure as code (IaC) is ubiquitous in the data space. That being said, I have stayed away from doing any IaC work for as long as I can remember, mainly due to its aura of being difficult and also because I could pass the ball to the platform team.
#datasky
#databs
about 1 year ago
2
2
0
(1/4) What do you use for #data ingestion? It's true that there is no shortage of tools when it comes to data ingestion. But before you open your wallet for one of the many options out there, it might be worth doing thorough due diligence based on your current and future needs.
#databs
about 1 year ago
1
0
0
I have been binging on the early chapters of the new book from @joereis.bsky.social on data modeling. I haven't consumed a lot of material on this topic besides Kimball, but this one is a must read if you work with data in a modern context. Looking forward to the official release in 2025!
#databs
about 1 year ago
1
7
1
(1/5) dbt core vs dbt cloud. dbt has been a game changer for many #data teams, mainly for writing reusable and version-controlled transformation logic. We are also witnessing an explosion of tools that want to become the next #dbt. Which is better, dbt core or dbt cloud?
#databs
about 1 year ago
1
1
0
reposted by
Bijil Subhash
Andy Pavlo
about 1 year ago
Buckle up because we're banging into the new year with my annual retrospective of the last year in databases! Highlights include license change blowback, Databricks vs. Snowflake gangwar, @duckdb.org's shotgun weddings, and buying a quarterback to impress your lover: www.cs.cmu.edu/~pavlo/blog/...
Databases in 2024: A Year in Review
Andy rises from the ashes of his dead startup and discusses what happened in 2024 in the database game.
https://www.cs.cmu.edu/~pavlo/blog/2025/01/2024-databases-retrospective.html
10
199
83
(1/5) What is Unity Catalog (UC) in the context of #databricks? The standard definition that you get is that it is a unified governance solution built into Databricks. It is accurate, but it was not intuitive to me when I started building on UC. See 🧵 for some additional context on UC.
#databs
about 1 year ago
1
0
0
(1/2) Autoloader is without doubt one of my favourite features in Databricks! In a nutshell, it is an abstraction that simplifies the incremental ingestion of data by monitoring the files that arrive in cloud storage, supporting reliable and resilient data pipelines cost-efficiently.
#databs
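To make the idea of incremental ingestion concrete, here is a toy sketch in plain Python. It is not the Autoloader API: the `ingest_new_files` function, the JSON checkpoint file, and the local folder standing in for cloud storage are all assumptions for illustration. It only shows the core pattern of remembering which files were already processed so that each run picks up new arrivals only.

```python
import json
import tempfile
from pathlib import Path


def ingest_new_files(landing_dir: Path, checkpoint_file: Path) -> list[str]:
    """Process only files not yet recorded in the checkpoint; return their names."""
    # Load the set of file names that earlier runs have already handled.
    seen = set(json.loads(checkpoint_file.read_text())) if checkpoint_file.exists() else set()
    new_files = sorted(p for p in landing_dir.glob("*.json") if p.name not in seen)
    for path in new_files:
        # A real pipeline would parse and load the file here.
        seen.add(path.name)
    # Persist progress so the next run skips everything processed so far.
    checkpoint_file.write_text(json.dumps(sorted(seen)))
    return [p.name for p in new_files]


with tempfile.TemporaryDirectory() as tmp:
    landing = Path(tmp) / "landing"  # hypothetical stand-in for a cloud storage path
    landing.mkdir()
    checkpoint = Path(tmp) / "checkpoint.json"

    (landing / "a.json").write_text("{}")
    first_run = ingest_new_files(landing, checkpoint)   # picks up a.json

    (landing / "b.json").write_text("{}")
    second_run = ingest_new_files(landing, checkpoint)  # picks up only b.json

    third_run = ingest_new_files(landing, checkpoint)   # nothing new arrived
```

Because progress lives in the checkpoint rather than in memory, a rerun after a failure does not reprocess files that were already recorded.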
about 1 year ago
1
2
0
If you are an aspiring data engineer, do yourself a favour by learning the basics of data modelling amongst other fundamentals (Python and SQL) before jumping into whatever tool is on the headlines.
about 1 year ago
0
3
0
Great write up on LLM frameworks.
www.anthropic.com/research/bui...
Without attaching a name to them, I tested all except the agent workflow in 2024. Currently running orchestrator-worker, prompt chaining, and routing workflows across a handful of projects in production.
Building effective agents
A post for developers with advice and workflows for building effective AI agents
https://www.anthropic.com/research/building-effective-agents
about 1 year ago
1
0
1
I am sure some of us can relate to this.
#dataengineer
#data
#dataarchitecture
about 1 year ago
0
1
0
Five ways to copy a dictionary in Python:
- Dictionary unpacking*
- Using the copy method*
- Using the dict constructor*
- Dictionary comprehension*
- Using deepcopy (from copy)
The first three have similar performance, followed by dictionary comprehension, and finally deepcopy.
*shallow copy
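A minimal sketch of the five approaches, using a hypothetical `original` dictionary with a nested list to show the practical difference between a shallow copy and a deep copy. It does not measure performance.

```python
import copy

# Hypothetical sample data; the nested list is what exposes shallow vs deep copying.
original = {"name": "pipeline", "tags": ["bronze", "silver"]}

unpacked = {**original}                              # dictionary unpacking (shallow)
method_copy = original.copy()                        # copy method (shallow)
constructed = dict(original)                         # dict constructor (shallow)
comprehension = {k: v for k, v in original.items()}  # dictionary comprehension (shallow)
deep = copy.deepcopy(original)                       # deepcopy (nested objects are copied too)

# Mutate the nested list through the original dictionary.
original["tags"].append("gold")

shallow_copies = [unpacked, method_copy, constructed, comprehension]
for shallow in shallow_copies:
    # Shallow copies share the same nested list object, so they see the change.
    print(shallow["tags"])

# The deep copy has its own list, so it is unaffected.
print(deep["tags"])
```

All five produce a new top-level dictionary, so adding or removing keys on the copy never affects the original. Only deepcopy also isolates nested mutable values.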
about 1 year ago
0
1
0
Do we have any data engineers in the
#buildinpublic
community? If so, what are you building?
about 1 year ago
1
2
0
What is your go-to analogy for explaining what a data engineer does? Check the 🧵 for the one that I use from time to time.
about 1 year ago
1
1
0