Models from AI labs headquartered in China 🇨🇳 are now competitive with the leading models globally 🌎

Qwen 2 72B from Alibaba Cloud has the highest MMLU score among open-source models, and Yi Large from 01.AI and Deepseek v2 from DeepseekAI are amongst the highest-quality models while being priced very competitively. We have initiated coverage of all three on Artificial Analysis.

Previously, models from AI labs headquartered in China were generally not competitive with those from leading labs elsewhere. They also had multilingual weaknesses, likely due to Chinese-focused training datasets, and in some cases output Chinese characters in response to English prompts. This has changed over the past couple of months, with newly released models benchmarking amongst the leading models globally.

These labs have achieved this using techniques similar to their global peers: training models on many times more tokens than is Chinchilla optimal (see the rough sketch below), training larger models, using architectures like Mixture of Experts, and improving training data quality (including through extensive use of synthetic and LLM-refined data). The labs are also increasing their marketing to global audiences, as shown by Yi Large being accessible on Fireworks AI.

While Qwen 2 72B has the highest MMLU score among open-source models, it is important to note that Meta has announced it will shortly release Llama 3 405B, which is likely to far exceed the capabilities of all open-source models available today.

Link to analysis: https://lnkd.in/g4bbqEre
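As a back-of-the-envelope illustration of the "many times more tokens than is Chinchilla optimal" point, here is a minimal Python sketch. The ~20 tokens-per-parameter ratio is the widely cited Chinchilla rule of thumb, and the 7T-token pretraining figure for Qwen 2 72B is a reported estimate rather than a number verified here:

```python
# Back-of-the-envelope check on "many times more tokens than is Chinchilla
# optimal". The ~20 tokens-per-parameter ratio is the Chinchilla rule of
# thumb; the 7T-token figure for Qwen 2 72B is a reported estimate.

CHINCHILLA_TOKENS_PER_PARAM = 20

def chinchilla_optimal_tokens(n_params: float) -> float:
    """Compute-optimal training tokens under the Chinchilla heuristic."""
    return CHINCHILLA_TOKENS_PER_PARAM * n_params

params = 72e9                    # Qwen 2 72B parameter count
optimal = chinchilla_optimal_tokens(params)
reported = 7e12                  # reported pretraining tokens (estimate)

print(f"Chinchilla-optimal: {optimal / 1e12:.2f}T tokens")
print(f"Reported training:  {reported / 1e12:.2f}T tokens "
      f"({reported / optimal:.1f}x optimal)")
```

On these assumptions, Qwen 2 72B would have been trained on roughly five times its compute-optimal token count, which is the sense in which these models are "over-trained" relative to Chinchilla.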
More Relevant Posts
-
Today OpenAI banned access to its API from China. China has already blocked access to ChatGPT using the 'Great Firewall'. In the past few months, we have seen models from AI labs HQ'd in China start to 'catch up' to the quality of models developed globally. These geo-restrictions will support demand for models developed by AI labs headquartered in China, and we may also see an acceleration in AI development to meet this demand. 01.AI, Deepseek, Alibaba Cloud, SenseTime 商汤科技, and Baidu, Inc. are key companies to watch in this space. For those using AI, this adds another consideration when choosing models: APIs may not be accessible from everywhere, and we could see further restrictions (e.g. on the use of LLM outputs). We will look to provide information on this on Artificial Analysis to support users choosing technologies.
-
MSRA: VASA-1. This diffusion-based #AI model, developed by Microsoft, has just been released, and given its hyperrealism it is pretty remarkable. I find it especially interesting for how accurately faces are rendered in 3D. On further analysis, however, the model teaches us more about how AI development unfolds.

Overlook. In the Western world we often see sensationalistic, profit-driven releases of technology without adequate safety measures (e.g. ChatGPT). Meanwhile, institutions like The Stanford Institute for Human-Centered Artificial Intelligence (HAI) publish reports that are US-centric and overlook major global platforms such as WeChat, TikTok, SHEIN, or Temu – among the most downloaded and addictive apps on the planet, which accumulate vast amounts of data and have already developed advanced computer vision, voice recognition, and deep-learning-based content recommendation algorithms. This defies common sense and, at face value, suggests that AI development in China may be purposefully kept under the radar.

Microsoft's Influence. Adding credibility to this notion, VASA-1 has now been released by a major US tech company. However, MSRA, the lab that developed it, stands for Microsoft Research Asia and is a top AI lab based in Beijing. The release therefore highlights both Microsoft's significant contributions and the advanced state of AI development in China, and may well be why the company is able to operate there successfully, considering this FT article:

– Founded by Taiwanese computer scientist Lee Kai-Fu, MSRA has been an important training centre for Chinese tech talent. Its star-studded alumni list includes Alibaba chief technology officer Wang Jian, SenseTime chief Xu Li, and Yin Qi, head of the AI group Megvii. "MSRA's contribution to AI has been phenomenal," said one tech consultant in China who has previously worked with Microsoft. "It has been working in the field for a long time. Many ex-colleagues have joined Chinese tech companies and boosted the overall AI ecosystem in China." Microsoft has been in China for more than three decades. It has retained a strong presence in the country, even as other Western tech groups, including Google, eBay, Facebook and Uber, have been forced out by competition or regulation. ➤ https://lnkd.in/e2cWKSNs –

Responsibility. Although there are certainly ethical concerns with releasing technology capable of easily producing problematic deepfakes, it is reassuring that the VoxCeleb2 dataset used for training the model is publicly available and sourced responsibly. In general, I believe that with deep AI innovation occurring behind the scenes in China, and quantum artificial intelligence gradually coming into view, we are set to witness even more responsibly developed technological breakthroughs, somewhat similar to VASA-1.

➤ VASA-1 research paper: https://lnkd.in/e8DAq5VV
➤ HAI AI Index Report 2024: https://lnkd.in/eqFUhGsa
-
In my opinion, SLM-based AI fits Japan, as the country has so many SMEs that may not be able to invest in innovation due to the high cost of hardware and software.
-
"It is not possible to produce an AI system that is not biased, not because of technological challenges, it’s because bias is in the eye of the beholder. The solution is by making it open source, free, and diverse." A wonderful conversation with Yann LeCun, the Chief AI Scientist at Meta, professor at NYU, and one of the most influential researchers in the history of AI. Here are some of the insights: - Autoregressive LLMs are not the way we’re going to make progress towards superhuman intelligence for various reasons, including the lack of essential characteristics such as understanding the physical world, persistent memory, reasoning, and planning, autoregressive LLMs cannot achieve human-level intelligence despite their utility and potential for applications. - LLMs face limitations due to autoregressive prediction, as each token generated carries a probability of leading the model away from reasonable answers, with this risk compounding exponentially for each subsequent token produced. - It is not possible to produce an AI system that is not biased not because of technological challenges, although they are technological challenges to that, it’s because bias is in the eye of the beholder, so the answer is the same answer that we found in liberal democracy about the press, the press needs to be free and diverse. - And if some of those top systems are open source, anybody can use them, anybody can fine tune them. If we put in place some systems that allows any group of people, whether they are individual citizens, groups of citizens, government organizations, NGOs, companies, whatever, to take those open source AI systems and fine tune them for their own purpose on their own data, then we’re going to have a very large diversity of different AI systems that are specialized for all of those things. - Achieving Artificial General Intelligence (AGI) won't be a sudden breakthrough but rather a gradual process involving advancements in learning from video, memory capabilities, reasoning, planning, and hierarchical representations. it will likely take at least a decade or more due to unforeseen challenges. - LLaMA 3 will include improvements in size, multimodality, and training capabilities, particularly focusing on systems capable of understanding the world, planning, reasoning, and learning from video.
-
Full Stack B2B Technology Product Marketing Professional | Growth Marketing Strategist | Mentor, Coach & Advisor
NOT READY FOR PRIME TIME: ARE AI TECHNOLOGIES MATURE YET?

Everywhere we look, it seems that #AI tools are taking over. Every day there are more and more new tools designed to help us process data, develop content, draw images, do basic research, and more. But if you take a look at the output generated by those AI tools, would you say that the output is a quality product? Is it something that you could actually use without substantial reworking? No.

If you don't believe me, take a look at some of the AI-generated #collaborativelearning articles posted right here on LinkedIn. The quality of those articles is atrocious. The writing in most of them is perhaps comparable to something a precocious third-grader might create. While the topic and sentences may be technically understandable, the information isn't communicated clearly, the phrasing is weak, the sentences are stilted, and there is a definite lack of humanity in the sentence structures. That's just one way you can tell that LinkedIn's collaborative-article AI is still very much in its infancy.

The reality is that AI tools are still learning and still growing. Yes, they seem to be doing it faster and faster, but they're still not where they really need to be yet, or where they will get to be.

Essentially, this class of AI learns through an iterative process built on large language models (LLMs). Think of an #LLM as a system trained on a massive repository of data collected from across the Internet: online sources such as libraries and universities, government websites, social platforms, business websites, and numerous other sources. Training algorithms analyze these sources to teach the LLM how to categorize, structure, and organize the data, which it then uses to respond to queries. As the LLM improves, the answers to queries improve over time; this is the iterative process. The model does this by assigning probabilities to the words used in a response, the order they are placed in, and many more parameters (this is an extremely simplified explanation; a toy sketch follows below). After a response is generated, feedback from the user on whether the answer was correct feeds into further adjustments. This type of iterative learning takes time, but it slowly improves the output.

That's why right now AI is in what I call the "toy stage": the stage where researchers and individuals around the globe are playing with AI to see what it can and cannot do, and what the existing and future capabilities are. That's why AI is still not quite ready for prime time. Yes, it's getting better, and it will continue to do so, but I'd suggest that AI still has quite a ways to go before it becomes a tool that can truly be relied upon and used on a consistent basis.

Thoughts? #AI #ailearning #aiadoption #aitools #aitoolsforbusiness #largelanguagemodel #LLM
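As a toy illustration of the "assigning probabilities to words" step described above, here is a minimal Python sketch. The four-word vocabulary and the scores are invented for demonstration; real models work over vocabularies of tens of thousands of tokens:

```python
import math

# Toy version of next-word prediction: the model scores each candidate
# next token, and a softmax turns the scores into probabilities. The
# vocabulary and logits below are invented purely for demonstration.

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]

vocab = ["cat", "dog", "car", "the"]
logits = [2.0, 1.5, 0.2, -1.0]  # hypothetical model scores

for token, p in zip(vocab, softmax(logits)):
    print(f"{token:>4}: {p:.2f}")

# Sampling from (or taking the argmax of) this distribution picks the next
# word; training feedback gradually nudges these scores, which is the
# iterative improvement described in the post.
```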
-
🌐💡 Facing an unprecedented challenge, AI companies are on the verge of exhausting the entire internet's data for training advanced models! From exploring synthetic data to tapping into unorthodox data sources, the quest for innovative solutions is on. But as debates over sustainability and ethical implications intensify, the industry stands at a crossroads. Can we pave a path towards more efficient, responsible AI development, or will we witness a shift in the quest for 'bigger and better'? Dive into the details and join the conversation here https://lnkd.in/gx2n6mzd #AIEthics #DataCrisis #SustainableAI #TechInnovation
AI Companies Running Out of Training Data After Burning Through Entire Internet (futurism.com)
-
**China Approves Over 40 AI Models for Public Use in 6 Months**

China has approved more than 40 artificial intelligence models for public use in the first six months since the approval process began, according to Reuters. This marks a significant milestone in the country's AI development, indicating the government's commitment to fostering innovation in this sector.

**Approvals for LLMs and Generative AI Services**

Chinese regulators granted approvals to 14 large language models (LLMs) for public use, signaling a focus on the development and deployment of advanced AI technologies. The Cyberspace Administration of China also approved the first batch of generative AI services for public launch, opening up opportunities for domestic players to compete with established LLMs.

**Diverse Range of Companies and AI Applications**

The approved models come from a diverse range of companies, including tech giants like Xiaomi, Baidu, Alibaba, and ByteDance, as well as startups like Zhipu and Baichuan. The report also highlights growing interest and investment in generative AI services, with companies globally launching their own LLMs to offer content, image, and voice generation services.
China's AI Revolution: Over 40 Models Approved in Just 6 Months! (growmybag.tv)
-
AI remains fundamentally reliant on data. As AI models, especially LLMs trained on publicly accessible data, approach a plateau in performance gains, the emphasis is shifting towards obtaining specialized, high-quality datasets to maintain a competitive edge. Companies are increasingly striking deals to license content from publishers, or potentially acquiring entire companies for their valuable data assets.
The AI arms race may soon center on a competition for 'expert' data (fastcompany.com)
-
Advising leaders on the Human Elements of Business Technology, Market Strategy, Innovation, Marketing & Commercial Excellence and Change Management 🔮 Trends & Art Expert, Keynote Speaker 💡 Editor & Podcaster
⭐ CIO UPDATE: Google's latest AI tool, Gemini, presents a range of exciting business opportunities, particularly for those seeking innovative solutions in artificial intelligence. Gemini stands out for its ability to process and understand various types of data, including text, code, audio, images, and video, and it comes in different variations, each tailored for specific use cases and complexity levels.

Gemini is particularly powerful for creative tasks, as it can produce unique content in various formats, including text, images, and audio. Its capabilities extend to multimodal question answering, summarization, translation, content generation, and reasoning. This versatility allows it to handle complex tasks and draw informed assumptions and conclusions, making it a valuable tool for problem-solving and decision-making.

For businesses, Gemini's advanced capabilities could be a game-changer in sectors like healthcare, finance, logistics, and the creative industries. Its integration with Google's vast array of products and services, including Google Cloud, Gmail, Google Workspace, and hardware devices, further enhances its applicability and accessibility.

Compared to other AI models like OpenAI's GPT-4, Gemini's multimodal nature makes it more versatile for handling a wider range of tasks and data types, and its approach of combining various AI models into a cohesive network allows for more advanced and integrated AI solutions.

Stay ahead of the curve and explore the possibilities with Gemini.
-
GPT-4 costs $78M to develop, but that's just the tip of the iceberg 📈

This infographic is based on Stanford University's 2024 Artificial Intelligence Index Report, which highlights the substantial cost increase of building cutting-edge AI models such as OpenAI's GPT-4 and Google's Gemini Ultra. The development cost of GPT-4 is estimated at $78 million, while Gemini Ultra's expenses soared to a stunning $191 million. Anthropic CEO Dario Amodei expects the cost of training such models to reach $10 billion or even $100 billion within the next three years.

The study also reveals that chips and staff are the most significant cost factors, amounting to tens of millions of dollars each. With training costs projected to surpass a billion dollars by 2027, only well-funded organizations will be able to afford such investments.

David Cahn's article "AI's $600B Question" analyzes the widening disparity between the capital spending necessary for AI infrastructure and the income it generates. The latest reporting puts OpenAI's revenue at $3.4 billion, up from $1.6 billion in late 2023, a substantial portion of total AI revenue. Revenue from other AI applications remains limited, with few AI products achieving significant consumer adoption or financial success.