The Great Pretender

AI doesn’t know the answer, and it hasn’t learned how to care

Hands holding a mask of anonymity. Polygonal design of interconnected elements.
Image Credits: Ilya Lukichev / Getty Images

There is a good reason not to trust what today’s AI constructs tell you, and it has nothing to do with the fundamental nature of intelligence or humanity, with Wittgensteinian concepts of language representation, or even disinfo in the dataset. All that matters is that these systems do not distinguish between something that is correct and something that looks correct. Once you understand that the AI considers these things more or less interchangeable, everything makes a lot more sense.

Now, I don’t mean to short-circuit any of the fascinating and wide-ranging discussions about this happening continually across every form of media and conversation. We have everyone from philosophers and linguists to engineers and hackers to bartenders and firefighters questioning and debating what “intelligence” and “language” truly are, and whether something like ChatGPT possesses them.

This is amazing! And I’ve learned a lot already as some of the smartest people in this space enjoy their moment in the sun, while from the mouths of comparative babes come fresh new perspectives.

But at the same time, it’s a lot to sort through over a beer or coffee when someone asks “what about all this GPT stuff, kind of scary how smart AI is getting, right?” Where do you start — with Aristotle, the Mechanical Turk, the perceptron or “Attention Is All You Need”?

During one of these chats I hit on a simple approach that I’ve found helps people get why these systems can be both really cool and also totally untrustworthy, while subtracting not at all from their usefulness in some domains and the amazing conversations being had around them. I thought I’d share it in case you find the perspective useful when talking about this with other curious, skeptical people who nevertheless don’t want to hear about vectors or matrices.

There are only three things to understand, which lead to a natural conclusion:

  1. These models are created by having them observe the relationships between words and sentences and so on in an enormous dataset of text, then build their own internal statistical map of how all these millions and millions of words and concepts are associated and correlated. No one has told them “this is a noun,” “this is a verb,” “this is a recipe” or “this is a rhetorical device”; these are things that show up naturally in patterns of usage.
  2. These models are not specifically taught how to answer questions, in contrast to the familiar software that companies like Google and Apple have been calling AI for the last decade. Those are basically Mad Libs with the blanks leading to APIs: Every question is either accounted for or produces a generic response. With large language models, the question is just a series of words like any other.
  3. These models have a fundamental expressive quality of “confidence” in their responses. In a simple example of a cat recognition AI, it would go from 0, meaning completely sure that’s not a cat, to 100, meaning absolutely sure that’s a cat. You can tell it to say “yes, it’s a cat” if it’s at a confidence of 85, or 90, or whatever threshold produces your preferred behavior (see the sketch just after this list).
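
For the curious, here is roughly what that third point looks like in code. This is a minimal Python sketch; the classifier and its score are hypothetical stand-ins invented for illustration, not any real system’s API, but the threshold mechanics really are this simple.

```python
def cat_confidence(image) -> float:
    """Stand-in for a trained classifier. A real model would compute a
    score from the image; here we just pretend it returned one."""
    return 87.0  # hypothetical output for some hypothetical photo


def answer(image, threshold: float = 85.0) -> str:
    # The threshold is our choice, not the model's: the same score can
    # produce a yes or a no depending on the behavior we prefer.
    score = cat_confidence(image)
    return "yes, it's a cat" if score >= threshold else "not sure that's a cat"


print(answer(None))                # "yes, it's a cat" at the default cutoff of 85
print(answer(None, threshold=90))  # "not sure that's a cat" -- same score, stricter cutoff
```

Note that nothing in the sketch ever evaluates whether a cat is actually present; the score is just a number, and the yes/no line gets drawn wherever we find convenient.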

So given what we know about how the model works, here’s the crucial question: What is it confident about? It doesn’t know what a cat or a question is, only statistical relationships found between data nodes in a training set. A minor tweak would have the cat detector equally confident the picture showed a cow, or the sky, or a still life painting. The model can’t be confident in its own “knowledge” because it has no way of actually evaluating the content of the data it has been trained on.

The AI is expressing how sure it is that its answer appears correct to the user.

This is true of the cat detector, and it is true of GPT-4 — the difference is a matter of the length and complexity of the output. The AI cannot distinguish between a right and wrong answer — it can only make a prediction of how likely a series of words is to be accepted as correct. That is why it must be considered the world’s most comprehensively informed bullshitter rather than an authority on any subject. It doesn’t even know it’s bullshitting you — it has been trained to produce a response that statistically resembles a correct answer, and it will say anything to improve that resemblance.

The AI doesn’t know the answer to any question, because it doesn’t understand the question. It doesn’t know what questions are. It doesn’t “know” anything! The answer follows the question because, extrapolating from its statistical analysis, that series of words is the most likely to follow the previous series of words. Whether those words refer to real people, places, events, etc. is not material — only that they are like real ones.
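
To make that concrete, here’s a toy sketch in Python, my own illustration rather than how any production model actually works. It builds the crudest possible statistical map, counting which word follows which in a tiny made-up corpus, then extends text purely by likelihood. Real models learn vastly richer statistics over billions of words, but the principle is the same: the next word is chosen because it is probable, not because it is true.

```python
from collections import Counter, defaultdict

# A tiny invented corpus; a real training set would be millions of documents.
corpus = ("the capital of france is paris . "
          "the capital of spain is madrid .").split()

# Build the statistical map: for each word, count what followed it.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1


def continue_text(word: str, steps: int = 4) -> list[str]:
    """Extend text by repeatedly emitting the most likely next word."""
    out = []
    for _ in range(steps):
        if word not in follows:
            break  # nothing ever followed this word in the corpus
        word = follows[word].most_common(1)[0][0]  # likeliest follower
        out.append(word)
    return out


print(continue_text("capital"))  # ['of', 'france', 'is', 'paris'] (ties break toward the first word seen)
```

The sketch “answers” that the capital of France is Paris not because it knows anything about France, but because those words most often followed one another in its corpus. Train it on text where they didn’t, and the very same machinery would assert something else just as readily.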

It’s the same reason AI can produce a Monet-like painting that isn’t a Monet — all that matters is it has all the characteristics that cause people to identify a piece of artwork as his. Today’s AI approximates factual responses the way it would approximate “Water Lilies.”

Now, I hasten to add that this isn’t an original or groundbreaking concept — it’s basically another way to explain the stochastic parrot, or the undersea octopus. Those problems were identified very early by very smart people and represent a great reason to read commentary on tech matters widely.

But in the context of today’s chatbot systems, I’ve just found that people intuitively get this approach: The models don’t understand facts or concepts, only relationships between words, and their responses are an “artist’s impression” of an answer. Their goal, when you get down to it, is to fill in the blank convincingly, not correctly. This is the reason why their responses fundamentally cannot be trusted.

Of course sometimes, even a lot of the time, its answer is correct! And that isn’t an accident: For many questions, the answer that looks the most correct is the correct answer. That is what makes these models so powerful — and dangerous. There is so, so much you can extract from a systematic study of millions of words and documents. And unlike recreating “Water Lilies” exactly, there’s a flexibility to language that lets an approximation of a factual response also be factual — but also make a totally or partially invented response appear equally or more so. The only thing the AI cares about is that the answer scans right.

This leaves the door open to discussions around whether this is truly knowledge, what if anything the models “understand,” if they have achieved some form of intelligence, what intelligence even is and so on. Bring on the Wittgenstein!

Furthermore, it also leaves open the possibility of using these tools in situations where truth isn’t really a concern. If you want to generate five variants of an opening paragraph to get around writer’s block, an AI might be indispensable. If you want to make up a story about two endangered animals, or write a sonnet about Pokémon, go for it. As long as it is not crucial that the response reflects reality, a large language model is a willing and able partner — and not coincidentally, that’s where people seem to be having the most fun with it.

Where and when AI gets it wrong is very, very difficult to predict because the models are too large and opaque. Imagine a card catalog the size of a continent, organized and updated over a period of a hundred years by robots, from first principles that they came up with on the fly. You think you can just walk in and understand the system? It gives a right answer to a difficult question and a wrong answer to an easy one. Why? Right now that is one question that neither AI nor its creators can answer.

This may well change in the future, perhaps even the near future. Everything is moving so quickly and unpredictably that nothing is certain. But for the present this is a useful mental model to keep in mind: The AI wants you to believe it and will say anything to improve its chances.
