Lorenzo Thione’s Post


Public Speaker & Investor in Artificial Intelligence / Broadway Producer / 🏳️🌈 Advocate

Groq’s demo is compelling proof that the chip race for better AI inference is only starting. Generally, there is going to be a lot of really interesting chip-side innovation aimed at putting more computation at the edge. Groq’s chips are designed specifically for running LLMs (they’re “LPUs”, language processing units), and they can run Mixtral 8x7B at almost 500 tokens per second; by comparison, GPT-3.5 Turbo runs at around 100 tokens/s. This, coming from a roughly 200-person company, is a massive feat when compared to huge players like NVIDIA.

And NVIDIA is pushing forward as well: just last week it launched Chat with RTX, a tool that runs an LLM locally on your PC and lets it ingest the data stored on your computer. These are both super cool releases that will help make LLMs more usable and more pervasive.

Between inference at the edge, privacy and locality, and more speed at lower compute cost, the revolution is well underway. I expect we’ll see even more advancement, and likely specialization, from these and other chipmakers in the future, including Apple, which is rumored to have several AI-first chips ready to release. #Gaingels #AI #ArtificialIntelligence #NVIDIA #Groq
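To put those throughput numbers in perspective, here is a quick back-of-the-envelope sketch. The 500 and 100 tokens/s figures come from the post; the 1,000-token response length is an illustrative assumption, not a benchmark.

```python
# Back-of-the-envelope: wall-clock time to generate one response
# at the decode throughputs quoted above. The throughput figures
# are from the post; the response length is a made-up example.

def generation_time(num_tokens: int, tokens_per_second: float) -> float:
    """Seconds to generate num_tokens at a given decode throughput."""
    return num_tokens / tokens_per_second

RESPONSE_TOKENS = 1_000  # hypothetical response length

groq_mixtral = generation_time(RESPONSE_TOKENS, 500)  # Groq LPU, Mixtral 8x7B
gpt35_turbo = generation_time(RESPONSE_TOKENS, 100)   # GPT-3.5 Turbo baseline

print(f"Groq Mixtral 8x7B: {groq_mixtral:.1f}s")  # 2.0s
print(f"GPT-3.5 Turbo:     {gpt35_turbo:.1f}s")   # 10.0s
```

A ~5x speedup at decode time is the difference between a response that streams faster than you can read it and one you visibly wait on, which is why inference-specific silicon matters for interactive use cases.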

📹 Aaron Jones

Co-Founder @ Yepic AI | Edge-Based Generative Video Chatbots


The edge really is the only future we have. Making models more accessible, efficient, and privacy-focused will unlock a tsunami of new use cases, like the App Store did for Apple.

Francesco Cracolici

Posting about tech investments🤑, startup growth hacks 👽(mostly emerging markets) 🌎


For real, how did you learn all this stuff?😍🤣


