Director of AI Startups @ Microsoft for Startups | LinkedIn Top AI Voice | 3x Top 10 Women in AI Award Recipient | Keynote Speaker | Startup Advisor | Responsible AI Advocate | EB1A “Einstein Visa” Recipient | x-IBM
There's now an open source solution to slash LLM costs!

🔍 𝐓𝐡𝐞 𝐓𝐋/𝐃𝐑:
- LMSYS launched an open source framework, "RouteLLM".
- It uses data from Chatbot Arena, plus advanced data augmentation techniques, to learn how to route queries to the most appropriate model.
- It routes intelligently based on query complexity and model capabilities.

📈 𝐈𝐦𝐩𝐫𝐞𝐬𝐬𝐢𝐯𝐞 𝐑𝐞𝐬𝐮𝐥𝐭𝐬:
- The team reduced costs by up to 85% while maintaining 95% of GPT-4's performance level.
- The routers were robust and could handle new model pairs, like Claude 3 Opus & Llama 3 8B, without retraining.
- This performance was on par with commercial products like Martian and Unify AI, but at 40% lower cost.

🤔 𝐄𝐱𝐜𝐢𝐭𝐞𝐝 𝐚𝐛𝐨𝐮𝐭 𝐜𝐨𝐬𝐭 𝐞𝐟𝐟𝐢𝐜𝐢𝐞𝐧𝐭 𝐀𝐈? 𝐋𝐞𝐭 𝐦𝐞 𝐤𝐧𝐨𝐰 𝐢𝐟 𝐲𝐨𝐮'𝐯𝐞 𝐜𝐨𝐦𝐞 𝐚𝐜𝐫𝐨𝐬𝐬 𝐚𝐧𝐲 𝐢𝐧𝐧𝐨𝐯𝐚𝐭𝐢𝐯𝐞 𝐬𝐨𝐥𝐮𝐭𝐢𝐨𝐧𝐬 𝐨𝐫 𝐤𝐧𝐨𝐰 𝐬𝐨𝐦𝐞𝐨𝐧𝐞 𝐬𝐨𝐥𝐯𝐢𝐧𝐠 𝐫𝐞𝐚𝐥 𝐩𝐫𝐨𝐛𝐥𝐞𝐦𝐬 𝐢𝐧 𝐭𝐡𝐢𝐬 𝐬𝐩𝐚𝐜𝐞!

Link to blog and white paper 👇
--------
🔔 If you like this, please repost it and share it with anyone who should know this ♻️ and follow me Heena Purohit, for more AI insights and trends.

#artificialintelligence #generativeAI #startups #enterpriseAI #AIforBusiness
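To make the routing idea concrete, here's a minimal toy sketch of the concept (not RouteLLM's actual API or learned router): score each query's "complexity" with a crude heuristic, then send easy queries to a cheap model and hard ones to a strong one. The model names, threshold, and scoring rules here are all illustrative placeholders; RouteLLM itself learns this decision from Chatbot Arena preference data.

```python
def complexity_score(query: str) -> float:
    """Crude stand-in for a learned router: longer queries and
    reasoning-heavy keywords push the score toward 1.0."""
    score = min(len(query.split()) / 50, 1.0)  # length signal, capped at 1.0
    if any(kw in query.lower() for kw in ("prove", "derive", "analyze")):
        score = max(score, 0.8)                # reasoning-keyword signal
    return score

def route(query: str, threshold: float = 0.5) -> str:
    """Pick a model tier: strong model only when the query looks hard."""
    return "strong-model" if complexity_score(query) >= threshold else "cheap-model"

print(route("What's the capital of France?"))                    # cheap-model
print(route("Prove that the sum of two even numbers is even."))  # strong-model
```

In practice the scoring function is a trained classifier rather than a keyword heuristic, and the threshold is how you trade cost against quality: lower it and more traffic goes to the strong (expensive) model.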
This is amazing. Now, we don't have to choose/know which model to use for what use case. We have an AI to choose which AI to use.
It's an interesting result. However, as other comments have mentioned:
- What test/benchmark was used?
- Is it safe to assume the tests used the public versions of each model?
- In a private environment, the opex costs of hosting all the models might be prohibitive.
- This approach doesn't address the data quality challenges that drive hallucinations.

A positive step, but many questions remain.
Heena Purohit - very cool! We at Sync Computing are in this space, helping enterprises achieve more cost-efficient use of Enterprise AI via Databricks compute optimization. We've demonstrated 60% cost savings for large enterprises!
There’s a saying in software: “Make it work, then make it work better.” RouteLLM can help optimize costs “better” because sometimes you don’t need the best model for a given task, just a model that's good enough.
Eventually this spread is going to become tighter and tighter until cost and performance are almost singular, and what matters is the proprietary data backing individual models.
Where are the proofs?
Out of date.
Great share, but there’s a big question: RouteLLM's results were measured against standard benchmarks, yet the most performant model for real-world scenarios emerges from much trial and error. I wonder how LMSYS solves for routing based on user preference versus a standardized metric.
To scale, we need a lot more cost optimization with open source tools like this.
Link to blog post: https://lmsys.org/blog/2024-07-01-routellm/
White paper for more info: https://arxiv.org/abs/2406.18665

Note: The idea of LLM routing isn't new. However, earlier routing solutions were based on *task-specific* routing (the concept that different models are better at different tasks); what's novel here is routing on preference data and query difficulty.