Matei Zaharia’s Post

CTO & Cofounder at Databricks, CS Professor at Berkeley

At Databricks, we've built an awesome model training and tuning stack. We have now used it to release DBRX, the best open source LLM on standard benchmarks to date, exceeding GPT-3.5 while running 2x faster than Llama-70B: https://lnkd.in/ghPNSRyw

DBRX is a 132B parameter MoE model with 36B active params, fine-grained (4-of-16) sparsity, and 32K context. It was trained using the dropless MoE approach pioneered by my student Trevor Gale in MegaBlocks (and based on his code). Technical details: https://lnkd.in/ghPNSRyw

You can try DBRX on Databricks model serving and playground today, with an OpenAI-compatible API! And most importantly, if you want to tune, RLHF or train your own model, we have everything we needed to build this from scratch.
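For readers unfamiliar with the "4-of-16" figure, here is a minimal Python sketch of what top-k expert selection means, using only the numbers quoted in the post (16 experts, 4 active, 132B total and 36B active params). The `top_k_experts` helper, the made-up router scores, and the choice to softmax over only the selected experts are illustrative assumptions, not DBRX's or MegaBlocks' actual router.

```python
import math

def top_k_experts(router_scores, k):
    """Pick the k highest-scoring experts and softmax-normalize their scores.

    Hypothetical helper for illustration only; not the DBRX router.
    """
    chosen = sorted(range(len(router_scores)),
                    key=lambda i: router_scores[i], reverse=True)[:k]
    peak = max(router_scores[i] for i in chosen)  # subtract max for stability
    exps = [math.exp(router_scores[i] - peak) for i in chosen]
    total = sum(exps)
    weights = [e / total for e in exps]
    return chosen, weights

# Figures quoted in the post.
NUM_EXPERTS, ACTIVE_EXPERTS = 16, 4
TOTAL_PARAMS_B, ACTIVE_PARAMS_B = 132, 36

# Made-up router scores for a single token (assumption, not real data).
scores = [0.1 * ((7 * i) % 16) for i in range(NUM_EXPERTS)]

chosen, weights = top_k_experts(scores, ACTIVE_EXPERTS)

# Number of distinct 4-expert subsets a token could be routed to.
expert_combinations = math.comb(NUM_EXPERTS, ACTIVE_EXPERTS)

# Share of the model's parameters used per token, from the post's figures.
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B

print(chosen, [round(w, 3) for w in weights])
print(expert_combinations, round(active_fraction, 3))
```

The active share (36 of 132) comes out slightly above 4/16, presumably because non-expert parameters such as attention layers run for every token regardless of routing.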

Mohammad Khadra, MBA

🌟 CTO & Co-Founder @ Tachyhealth | Digital Transformation Leader | AI & Analytics Strategist🌟

4mo

Hi Matei Zaharia, is there a way to use Databricks in an on-prem setting?

Edmondo Porcu

Distinguished Engineer @ Capital One

4mo

This is absolutely impressive

Gourav Sengupta

Head - Data Engineering, Quality, Operations, and Knowledge

4mo

Beautiful 😃 Matei Zaharia, so the RLHF framework is now available in Databricks?

Abhi Mishra

CTO @ Human Interest

4mo

> And most importantly, if you want to tune, RLHF or train your own model, we have everything we needed to build this from scratch.

Matei Zaharia Impressive results. I might be missing something, but are you releasing the training source code and datasets (or pointers to them) as well? You seem to be saying that everything needed to build this from scratch is available, but I didn't see those details in the blog post or on GitHub or Hugging Face.

Karl Asger Juhl

AI Engineer @ The Motley Fool 🃏

4mo

Matei Zaharia Thanks for showing this! Now let's get nice observability views for compound systems and also some DSPy examples in Databricks!

Bijit Ghosh

CTO - Global Head of Cloud Product & Engineering & AI/ML - at Deutsche Bank

4mo

Congrats! Are you folks planning to explore smaller architectures as well? I'm quite curious about the scaling laws behind those 12T tokens! Also, 132B models aren't affordable to run for any non-enterprise company.

Kingsley Uyi Idehen

Founder & CEO at OpenLink Software | Advancing Data Connectivity, Multi-Model Data Management, and AI Smart Agents | Unifying Disparate Data Silos via Open Standards (SQL, SPARQL, RDF, ODBC, JDBC, HTTP, GraphQL)

4mo

How do I test this? For instance, is there a generally accessible Chat interface similar to what the likes of #ChatGPT, #Mistral, #Claude3 etc. offer?

Congratulations! The HumanEval score of 70.1% is especially impressive! I’m looking forward to seeing how the community builds on this model.

That sounds amazing! 🤩

Tyler Lynch

@AWS Cloud ☁️ Polyglot Solutions Architect, Team Enabler, Enterprise Strategist, Terraform Core Contributor, Networking Automation Engineer.

4mo

The team at Databricks has always set the standard for delivery and execution high. I’m excited to see how this new area of depth helps your customers unlock new potential.
