Matei Zaharia
@matei-zaharia.bsky.social
439
72
15
CTO at Databricks and CS professor at UC Berkeley.
https://people.eecs.berkeley.edu/~matei/
reposted by
Matei Zaharia
Andrew Drozdov
9 months ago
We built a thing! The Databricks Reranker is now in Public Preview. It's as easy as changing the arguments to your vector search call, and doesn't require any additional setup. Read more:
www.databricks.com/blog/reranki...
Reranking in Mosaic AI Vector Search for Faster, Smarter Retrieval in RAG Agents
Boost RAG agent quality with reranking: deliver more relevant answers in less time with a single parameter in Mosaic AI Vector Search.
https://www.databricks.com/blog/reranking-mosaic-ai-vector-search-faster-smarter-retrieval-rag-agents
0
4
1
Excited to launch Agent Bricks, a new way to build auto-optimized agents on your tasks. Agent Bricks uniquely takes a *declarative* approach to agent development: you tell us what you want, and we auto-generate evals and optimize the agent.
www.databricks.com/blog/introdu...
Introducing Agent Bricks: Auto-Optimized Agents Using Your Data
Discover Agent Bricks by Databricks, a new way to build production-ready AI agents using your data. Automatically evaluate, optimize, and scale agents with higher accuracy and lower cost.
https://www.databricks.com/blog/introducing-agent-bricks
11 months ago
1
2
0
Apache Spark 4.0 is out with some huge improvements across the board. SQL's much more powerful, Spark Connect makes it easier to run apps, new languages and more. It's amazing to see the community still growing fast and releasing over 5000 patches in 4.0.
www.databricks.com/blog/introdu...
Introducing Apache Spark 4.0
Explore Apache Spark 4.0's key updates: advanced SQL features, improved Python support, enhanced streaming, and productivity boosts for big data analytics.
https://www.databricks.com/blog/introducing-apache-spark-40
12 months ago
0
4
1
#MLSys 2025 is next week! You can still register at mlsys.org.
about 1 year ago
0
2
0
Nice results on never-ending learning for code editing. We believe that a lot of AI applications will be customizable this way (to every company's codebase, users, etc). The combined AI serving, data and MLOps environment on Databricks makes these easy to build.
www.databricks.com/blog/power-f...
The Power of Fine-Tuning on Your Data: Quick Fixing Bugs with LLMs via Never Ending Learning (NEL)
Discover how fine-tuning small open-source LLMs on interaction data enables faster, cheaper, and more accurate code fixes with Databricks Quick Fix.
https://www.databricks.com/blog/power-fine-tuning-your-data-quick-fixing-bugs-llms-never-ending-learning-nel
about 1 year ago
0
2
0
reposted by
Matei Zaharia
MLflow
about 1 year ago
New Video: Get Hands-On with MLflow Tracing! In this video, @danliden.com walks through how #MLflow Tracing boosts observability in #GenAI apps, great for debugging, experimentation & organizing data workflows. Watch now:
www.youtube.com/watch?v=iRbB...
#opensource #oss
MLflow Tracing | Introduction & Tutorial
YouTube video by MLflow
https://www.youtube.com/watch?v=iRbB4yIm3ag&t=29s
0
6
2
Really cool result from the Databricks research team: You can tune LLMs for a task *without data labels*, using test-time compute and RL, and outperform supervised fine-tuning! Our new TAO method scales with compute to produce fast, high-quality models.
www.databricks.com/blog/tao-usi...
TAO: Using test-time compute to train efficient LLMs without labeled data
LIFT fine-tunes LLMs without labels using reinforcement learning, boosting performance on enterprise tasks.
https://www.databricks.com/blog/tao-using-test-time-compute-train-efficient-llms-without-labeled-data
about 1 year ago
1
2
0
The #MLSys2025 program is up and registration is open! Check out accepted papers at mlsys.org/virtual/2025... and sign up to attend at mlsys.org/Register.
about 1 year ago
0
2
0
reposted by
Matei Zaharia
MLflow
about 1 year ago
Exciting news: MLflow 2.21.0 is live! This release includes significant features, enhancements, and bug fixes to improve documentation, #GenAI prompt management, tracing & more. Explore all the new features & improvements:
mlflow.org/releases/2.2...
#opensource #oss #mlflow
0
2
2
reposted by
Matei Zaharia
Lakshya A Agrawal
about 1 year ago
Introducing LangProBe: the first benchmark testing where and how composing LLMs into language programs affects cost-quality tradeoffs! We find that, on avg across diverse tasks, smaller models within optimized programs beat calls to larger models at a fraction of the cost.
1
6
5
reposted by
Matei Zaharia
Andrew Drozdov
about 1 year ago
We're probably a little too obsessed with zero-shot retrieval. If you have documents (you do), then you can generate synthetic data, and finetune your embedding. Blog post led by @jacobianneuro.bsky.social shows how well this works in practice.
www.databricks.com/blog/improvi...
Improving Retrieval and RAG with Embedding Model Finetuning
Fine-tune embedding models on Databricks to enhance retrieval and RAG accuracy with synthetic data, no manual labeling required.
https://www.databricks.com/blog/improving-retrieval-and-rag-embedding-model-finetuning
1
9
5
reposted by
Matei Zaharia
SAP
over 1 year ago
We're bringing in a new era of enterprise data management and agentic AI with SAP Business Data Cloud with Databricks: unifies your SAP and non-SAP data; natively embeds Databricks technology; AI agents streamline workflows. Learn more:
sap.to/sapbdc
0
10
5
Sponsor registration is open for #MLSys 2025. We have the most submissions ever to MLSys so it promises to be a great conference!
mlsys.org/Sponsors/spo...
2025 Sponsor / Exhibitor Information
https://mlsys.org/Sponsors/sponsorinfo
over 1 year ago
1
3
1
reposted by
Matei Zaharia
TechCrunch
over 1 year ago
Researchers open source Sky-T1, a "reasoning" AI model that can be trained for less than $450
Researchers open source Sky-T1, a "reasoning" AI model that can be trained for less than $450
So-called reasoning AI models are becoming easier, and cheaper, to develop. On Friday, NovaSky, a team of researchers based out of UC Berkeley's Sky Computing Lab, released Sky-T1-32B-Preview, a reasoning model that's competitive with an earlier version...
https://tcrn.ch/3C3VNYd
6
78
24
reposted by
Matei Zaharia
Sebastian Raschka (rasbt)
over 1 year ago
"Sky-T1-32B-Preview, our reasoning model that performs on par with o1-preview on popular reasoning and coding benchmarks." That was quick! Is this already the Alpaca moment for reasoning models? Source:
novasky-ai.github.io/posts/sky-t1/
3
39
8
Congrats to Meta on releasing Llama 3.3, a 70B model that matches the performance of Llama-405B! Open weight models are advancing so rapidly and the cost to get this performance is quickly going down. We're thrilled to let users serve & customize this on Databricks.
huggingface.co/meta-llama/L...
meta-llama/Llama-3.3-70B-Instruct · Hugging Face
We're on a journey to advance and democratize artificial intelligence through open source and open science.
https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct
over 1 year ago
0
8
0
reposted by
Matei Zaharia
Chris Potts
over 1 year ago
Compound AI Systems, Inference-time Compute Meetup @ NeurIPS 2024, with many AI luminaries as panelists. Poster submissions are open:
lu.ma/q5r8b67t
Compound AI Systems, Inference-time Compute Meetup @ NeurIPS 2024 · Luma
Meetup for practitioners and researchers working on and interested in compound AI systems, inference-time strategies and scaling laws, networks of networks,...
https://lu.ma/q5r8b67t
0
14
5