Matei Zaharia’s Post

CTO & Cofounder at Databricks, CS Professor at Berkeley

4mo

At Databricks, we've built an awesome model training and tuning stack. We've now used it to release DBRX, the best open source LLM on standard benchmarks to date, exceeding GPT-3.5 while running 2x faster than Llama-70B: https://lnkd.in/ghPNSRyw

DBRX is a 132B-parameter MoE model with 36B active params, fine-grained (4-of-16) sparsity, and 32K context. It was trained using the dropless MoE approach pioneered by my student Trevor Gale in MegaBlocks (and based on his code). Technical details: https://lnkd.in/ghPNSRyw

You can try DBRX on Databricks model serving and playground today, with an OpenAI-compatible API! And most importantly, if you want to tune, RLHF or train your own model, we have everything we needed to build this from scratch.
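For readers unfamiliar with the "4-of-16" notation in the post, here is a minimal sketch of what fine-grained top-k expert selection could look like. It is a generic illustration, not DBRX's actual router: the function names `expert_combinations` and `top_k_route`, the example logits, and the choice to softmax over only the selected experts are all assumptions made for this example.

```python
import math


def expert_combinations(num_experts: int, active: int) -> int:
    """Number of distinct expert subsets a token could be routed to."""
    return math.comb(num_experts, active)


def top_k_route(router_logits: list[float], k: int) -> list[tuple[int, float]]:
    """Hypothetical top-k router: pick the k highest-scoring experts and
    softmax-normalize their scores into mixing weights (assumed scheme)."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:k]
    # Subtract the max logit before exponentiating for numerical stability.
    peak = max(router_logits[i] for i in chosen)
    exps = [math.exp(router_logits[i] - peak) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]


if __name__ == "__main__":
    # 4-of-16 (as described in the post) vs. a coarser 2-of-8 layout.
    print(expert_combinations(16, 4))  # 1820
    print(expert_combinations(8, 2))   # 28

    # Made-up router scores for one token across 16 experts.
    logits = [0.1 * i for i in range(16)]
    for expert_id, weight in top_k_route(logits, k=4):
        print(expert_id, round(weight, 3))
```

Choosing 4 of 16 experts allows 1820 possible expert subsets per token, versus 28 for a 2-of-8 layout, which is one way to read "fine-grained" here. Only the selected experts run for a given token, which is why the active parameter count (36B) is much smaller than the total (132B).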

59 Comments

Mohammad Khadra, MBA

🌟 CTO & Co-Founder @ Tachyhealth | Digital Transformation Leader | AI & Analytics Strategist🌟

4mo

Hi Matei Zaharia, is there a way to use Databricks in an on-prem setting?

Edmondo Porcu

Distinguished Engineer @ Capital One

4mo

This is absolutely impressive

9 Reactions

Gourav Sengupta

Head - Data Engineering, Quality, Operations, and Knowledge

4mo

Beautiful 😃 Matei Zaharia, so an RLHF framework is now available in Databricks?

1 Reaction

Abhi Mishra

CTO @ Human Interest

4mo

> And most importantly, if you want to tune, RLHF or train your own model, we have everything we needed to build this from scratch. Matei Zaharia Impressive results. I might be missing something, but are you releasing the training source code and datasets (or pointers to them) as well? You seem to be saying that everything needed to build this from scratch is available, but I didn't see those details in the blog post or Github or Hugging Face.

Karl Asger Juhl

AI Engineer @ The Motley Fool 🃏

4mo

Matei Zaharia Thanks for showing this! Now let's get nice observability views for compound systems and also some DSPy examples in Databricks!

1 Reaction

Bijit Ghosh

CTO - Global Head of Cloud Product & Engineering & AI/ML - at Deutsche Bank

4mo

Congrats! Are you folks planning to explore smaller architectures as well? I'm quite curious about the scaling laws for those 12T tokens! Also, 132B models aren't affordable to run for any non-enterprise company.

1 Reaction

Kingsley Uyi Idehen

Founder & CEO at OpenLink Software | Advancing Data Connectivity, Multi-Model Data Management, and AI Smart Agents | Unifying Disparate Data Silos via Open Standards (SQL, SPARQL, RDF, ODBC, JDBC, HTTP, GraphQL)

4mo

How do I test this? For instance, is there a generally accessible Chat interface similar to what the likes of #ChatGPT, #Mistral, #Claude3 etc. offer?

Matt Johnson

4mo

Congratulations! The HumanEval score of 70.1% is especially impressive! I’m looking forward to seeing how the community builds on this model.

5 Reactions

Javier Mayorgas Cobos

Head of AI at Cívica

4mo

That sounds amazing! 🤩

5 Reactions

Tyler Lynch

@AWS Cloud ☁️ Polyglot Solutions Architect, Team Enabler, Enterprise Strategist, Terraform Core Contributor, Networking Automation Engineer.

4mo

The team at Databricks has always set the standard for delivery and execution high. I’m excited to see how this new area of depth helps your customers unlock new potential.

1 Reaction


More Relevant Posts

Balakrishna Narasimhan

Marketing Insights & Strategy at Google | All/AI views are my own
4mo Edited
Huge announcement from Databricks that should be getting more buzz. With a relatively small investment (reportedly ~$10M and 2 months of training), they've created an open source model, DBRX, that outperforms others and is much more efficient. With open source models getting better, more efficient and easier to manage/tune, a few questions come to mind:

1) Will differentiation or value capture happen higher up the stack, e.g., at the application, agent or orchestration layers, vs at the model level?
2) When should an enterprise choose a proprietary model vs an open one? For less strategic use cases, if proprietary models are much easier to manage and cheaper? Or alternatively, for specific use cases if proprietary models maintain their performance advantage?
3) If efficiency and tunability of general purpose open source models keep improving, what role will segmented models play outside of super-specialized use cases?
4) If you're a developer creating a new application, why create your own model or build on a proprietary model vs an open source model?

Of course, as Ethan Mollick has pointed out, the largest foundation models still outperform tuned models (https://lnkd.in/gBeySCJd) but I wonder how long they can maintain this lead. Look forward to hearing what you all think!

Sunil K. Prasad
3mo
Very impressive to see DBRX is a 132B parameter MoE model with 36B active params and fine-grained (4-of-16) sparsity and 32K context.

Alexander Terado

Solutions Architect at Databricks
4mo Edited
The most valuable takeaway: this was built end-to-end in Databricks. With our training and tuning stack you and your company could build an even better model. Try it out for yourself!

Jiri Harazim 🇺🇦

Data+AI evangelist in Manufacturing & Automotive. Personal account; views my own.
4mo
Some behind-the-scenes insights on #DBRX:

Greg Broadbent
4mo Edited
Bigger, Faster, Cheaper - Pick 3 😀 Today Databricks releases DBRX, the most powerful open source model available, and Will Knight from WIRED does a great job of explaining why it's such a big deal: https://lnkd.in/gxbxCbr5 The same tools #Databricks used to build this model are available for you to build your own custom model using (and maintaining control of) your data.

Mrityunjay Kumar

Data-Philia-ist @ Databricks | Empowering data teams to solve world’s toughest problems !!
4mo
An open source LLM from Databricks that is cheaper, smaller, faster and more effective than other open source LLMs out there. More importantly, if you want to fine-tune, RLHF or train your own model, we have everything we needed to build this from scratch. Check it out 👇

Jacqueline Williams

Enterprise Account Executive at Databricks
4mo
Not only 2x faster but 1/2 the cost of Llama-2. 🚀

Jonas Sommer

Early Growth Equity @ DTCF | Ex-Palantir | UC Berkeley
3mo
The speed with which open source LLMs are catching up is exciting: the new open source LLM DBRX by Databricks beats GPT-3.5 and is currently the best model on the Hugging Face Open LLM Leaderboard (ahead of MistralAI). And in July, Llama 3 by Meta will probably come out, setting another level.

Todd Cowles

Sr. Solutions Architect at Databricks
4mo
And there’s this: Mixture of Experts = model quality + GPU efficiency. In addition to the below, read “MegaBlocks: Efficient Sparse Training with Mixture of Experts.”

Josh Faure

Sports & Startups (APJ) @ Databricks
4mo
How good?! Not only can you use Databricks as a unified platform that handles all of your data use cases, including as a platform for #RAG that leverages some amazing external models such as #LlaMa2 or #Mixtral, but now you also get immediate out-of-the-box access to #DBRX, a model that outperforms all those common models, at a fraction of the cost!!
