AI

OpenAI debuts GPT-4o ‘omni’ model now powering ChatGPT

Comment

Image Credits: OpenAI

OpenAI announced a new flagship generative AI model on Monday that they call GPT-4o — the “o” stands for “omni,” referring to the model’s ability to handle text, speech, and video. GPT-4o is set to roll out “iteratively” across the company’s developer and consumer-facing products over the next few weeks.

OpenAI CTO Mira Murati said that GPT-4o provides “GPT-4-level” intelligence but improves on GPT-4’s capabilities across multiple modalities and media.

“GPT-4o reasons across voice, text and vision,” Murati said during a streamed presentation at OpenAI’s offices in San Francisco on Monday. “And this is incredibly important, because we’re looking at the future of interaction between ourselves and machines.”

GPT-4 Turbo, OpenAI’s previous “leading “most advanced” model, was trained on a combination of images and text and could analyze images and text to accomplish tasks like extracting text from images or even describing the content of those images. But GPT-4o adds speech to the mix.

What does this enable? A variety of things. 

Image Credits: OpenAI

GPT-4o greatly improves the experience in OpenAI’s AI-powered chatbot, ChatGPT. The platform has long offered a voice mode that transcribes the chatbot’s responses using a text-to-speech model, but GPT-4o supercharges this, allowing users to interact with ChatGPT more like an assistant. 

For example, users can ask the GPT-4o-powered ChatGPT a question and interrupt ChatGPT while it’s answering. The model delivers “real-time” responsiveness, OpenAI says, and can even pick up on nuances in a user’s voice, in response generating voices in “a range of different emotive styles” (including singing). 

GPT-4o also upgrades ChatGPT’s vision capabilities. Given a photo — or a desktop screen — ChatGPT can now quickly answer related questions, from topics ranging from “What’s going on in this software code?” to “What brand of shirt is this person wearing?”

ChatGPT’s desktop app in use in a coding task.
Image Credits: OpenAI

These features will evolve further in the future, Murati says. While today GPT-4o can look at a picture of a menu in a different language and translate it, in the future, the model could allow ChatGPT to, for instance, “watch” a live sports game and explain the rules to you.

“We know that these models are getting more and more complex, but we want the experience of interaction to actually become more natural, easy, and for you not to focus on the UI at all, but just focus on the collaboration with ChatGPT,” Murati said. “For the past couple of years, we’ve been very focused on improving the intelligence of these models … But this is the first time that we are really making a huge step forward when it comes to the ease of use.”

GPT-4o is more multilingual as well, OpenAI claims, with enhanced performance in around 50 languages. And in OpenAI’s API and Microsoft’s Azure OpenAI Service, GPT-4o is twice as fast as, half the price of and has higher rate limits than GPT-4 Turbo, the company says.

At present, voice isn’t a part of the GPT-4o API for all customers. OpenAI, citing the risk of misuse, says that it plans to first launch support for GPT-4o’s new audio capabilities to “a small group of trusted partners” in the coming weeks.

GPT-4o is available in the free tier of ChatGPT starting today and to subscribers to OpenAI’s premium ChatGPT Plus and Team plans with “5x higher” message limits. (OpenAI notes that ChatGPT will automatically switch to GPT-3.5, an older and less capable model, when users hit the rate limit.) The improved ChatGPT voice experience underpinned by GPT-4o will arrive in alpha for Plus users in the next month or so, alongside enterprise-focused options.

In related news, OpenAI announced that it’s releasing a refreshed ChatGPT UI on the web with a new, “more conversational” home screen and message layout, and a desktop version of ChatGPT for macOS that lets users ask questions via a keyboard shortcut or take and discuss screenshots. ChatGPT Plus users will get access to the app first, starting today, and a Windows version will arrive later in the year.

Elsewhere, the GPT Store, OpenAI’s library of and creation tools for third-party chatbots built on its AI models, is now available to users of ChatGPT’s free tier. And free users can take advantage of ChatGPT features that were formerly paywalled, like a memory capability that allows ChatGPT to “remember” preferences for future interactions, upload files and photos, and search the web for answers to timely questions.

We’re launching an AI newsletter! Sign up here to start receiving it in your inboxes on June 5.

Read more about OpenAI's Spring Event on TechCrunch

More TechCrunch

Vanta, a trust management platform that helps businesses automate much of their security and compliance processes, today announced that it has raised a $150 million Series C funding round led…

Vanta trust management platform raises $150M Series C, now valued at $2.45B

It’s not often you’ll find Microsoft, Amazon, and Meta in the same room collaborating toward the same goals. But that’s exactly what we have with the Overture Maps Foundation, an…

Backed by Microsoft, AWS, and Meta, the Overture Maps Foundation launches first open map datasets

The startup is not disclosing its valuation, but sources close to the company say the figure is just under $400 million post-money.

Dazz snaps up $50M for AI-based, automated cloud security remediation

The outcome of the Spanish authority’s probe could take up to two years to complete, and leave Apple on the hook for fines in the billions.

Apple’s App Store hit with antitrust probe in Spain

Proton’s first cryptocurrency product is a wallet called Proton Wallet that’s designed to make it easier to get started with bitcoin.

Proton releases a self-custody bitcoin wallet

Dental care is a necessity, yet many patients lack confidence in their dentists’ ability to provide accurate diagnoses and appropriate treatments. Some dentists over treat patients, leading to unnecessary expenses,…

Pearl raises $58M to help dentists make better diagnoses using AI 

Exoticca’s platform connects flights, hotels, meals, transfers, transportation and more, plus the local companies at the destinations.

Spanish startup Exoticca raises a €60M Series D for its tour packages platform

Content creators are busy people. Most spend more than 20 hours a week creating new content for their respective corners of the web. That doesn’t leave much time for audience…

Mark Zuckerberg imagines content creators making AI clones of themselves

Elon Musk says he will show off Tesla’s purpose-built “robotaxi” prototype during an event October 10, after scrapping a previous plan to reveal it August 8. Musk said Tesla will…

Elon Musk sets new date for Tesla robotaxi reveal, calls everything beyond autonomy ‘noise’

Alphabet will spend an additional $5 billion on its self-driving subsidiary, Waymo, over the next few years, according to Ruth Porat, the company’s chief financial officer. Porat announced the commitment…

Alphabet to invest another $5B into Waymo

There is no fool proof way to prevent a buggy update like CrowdStrike’s, but there are best practices that could mitigate the fallout.

How to prevent your software update from being the next CrowdStrike

Spotify CEO Daniel Ek says the streaming service is still in the “early days” of its plans to bring hi-fi support to the platform. During the company’s earnings call on…

Spotify CEO says company is in ‘early days’ of hi-fi audio plans

Featured Article

A comprehensive list of 2024 tech layoffs

The tech layoff wave is still going strong in 2024. Following significant workforce reductions in 2022 and 2023, this year has already seen 60,000 job cuts across 254 companies, according to independent layoffs tracker Layoffs.fyi. Companies like Tesla, Amazon, Google, TikTok, Snap and Microsoft have conducted sizable layoffs in the…

A comprehensive list of 2024 tech layoffs

Tesla was not the first company to begin working on a humanoid form factor, but while being the first to market does carry weight in this high-tech space, we’re at…

Elon Musk sets 2026 Optimus sale date. Here’s where other humanoid robots stand.

Harvey, a startup building what it describes as an AI-powered “copilot” for lawyers, has raised $100 million in a Series C round led by GV, Google’s corporate venture arm. The…

OpenAI-backed legal tech startup Harvey raises $100M

Digital banking startup Mercury informed some founders that it is no longer serving customers in certain countries, including Ukraine.

Digital banking startup Mercury abruptly shuttered service for startups in Ukraine, Nigeria, other countries

Welcome to TechCrunch Fintech! This week, we’re looking at Human Interest’s path toward an IPO, fintech’s newest unicorn, a slew of new fundraises, and more. To get a roundup of…

The next fintech to go public may not be the one you expected

Waymo has started testing on public roads in San Francisco a new robotaxi built by Chinese electric automaker Zeekr.  Waymo has “less than a handful” of the Zeekr vehicles in San…

The Waymo-Zeekr robotaxi has come to San Francisco

The transaction values Cyabra at $70 million, and the company expects the merger to close by the end of the year.

Cyabra, a startup helping companies and governments detect disinformation, plans to go public via SPAC

Featured Article

There’s a lot more to the Kamala Harris memes than you think

“You think you just fell out of a coconut tree?” says Vice President Kamala Harris in a now infamous clip. An overlay of the lime green album art for Charli XCX’s “Brat” flashes on the screen, while a remix of “Von Dutch” scores increasingly frenetic clips of Harris hysterically laughing…

There’s a lot more to the Kamala Harris memes than you think

GM’s self-driving car subsidiary Cruise is scrapping plans to build the Origin — a purpose-built robotaxi with no steering wheel or pedals — and will instead use the next-generation Chevrolet Bolt…

GM’s Cruise abandons Origin robotaxi, takes $583 million charge

The Federal Trade Commission announced on Tuesday that it’s ordering eight companies that offer AI-powered “surveillance service pricing” to turn over information about the potential impact these products have on…

FTC is investigating how companies are using AI to base pricing on consumer behavior

Meta AI, Meta’s AI-powered assistant across Facebook, Instagram, Messenger and the web, can now speak in more languages and create stylized selfies. And, starting today, Meta AI users can route…

Meta AI gets new ‘Imagine me’ selfie feature

Mesa, Arizona-based Rosotics has kept a low profile. From the startup’s website, one would think they are solely focused on selling large metal 3D printers to aerospace and defense customers.…

Rosotics wants to manufacture massive orbital shipyards using 3D printing

Meta’s latest open source AI model is its biggest yet. Today, Meta said it is releasing Llama 3.1 405B, a model containing 405 billion parameters. Parameters roughly correspond to a…

Meta releases its biggest ‘open’ AI model yet

Hustle culture is embedded into the Silicon Valley startup ethos, but the expectation to grind all the time can be detrimental to a founder’s mental health. We’re pleased to welcome…

Andy Dunn talks the importance of founder mental health at TechCrunch Disrupt 2024

Meta has been given until September 1 to respond to consumer protection concerns in the European Union. The Consumer Protection Cooperation (CPC) Network, a network of authorities responsible for the…

Meta given weeks to tell EU consumer protection authorities how it’ll fix ‘pay or consent’

Google is no longer proposing to deprecate third-party tracking cookies in Chrome, instead suggesting that users be given an option to deny tracking.

Google’s latest Privacy Sandbox gambit could pit user choice against tracking

Let’s start with the premise that many people take notes as they work with customers as part of their jobs. As they take notes, they may need to access a…

Noded AI wants to make your notes the center of your work world

Nathan Rosenberg, the founder of farm automation platform Farmblox, said if there is one thing to know about trying to sell technology to farmers, it’s that you can’t tell them…

Farmblox puts the control into farmers’ hands with its AI-powered sensor-reading platform