Former Meta scientists unveiled an AI model that can generate new proteins

EvolutionaryScale has raised $142 million for its protein language model, ESM3

Structural model of proteins displayed on a monitor.
Illustration: nicolas_ (Getty Images)

Former Meta scientists have unveiled a model that can generate proteins the way chatbots can generate words.

EvolutionaryScale, founded by a team of ex-Meta researchers who had worked on artificial intelligence models for biology, announced its protein language model, ESM3, in June. The company said ESM3 is “the first generative model for biology that simultaneously reasons over the sequence, structure, and function of proteins.” The model was trained on the sequence, structure, and function of over 2.7 billion proteins and can generate new proteins by following prompts.

“We want to build tools that can make biology programmable,” Alexander Rives, chief scientist at EvolutionaryScale and former lead of Meta’s “AI protein team,” told Nature.

In June, EvolutionaryScale announced it had raised $142 million in a seed round, counting Lux Capital, Amazon Web Services, Nat Friedman, Daniel Gross, and Nvidia venture capital arm NVentures as investors. Lux Capital co-founder and managing partner Josh Wolfe told Reuters the company is a “ChatGPT moment for biology.”

At Meta, Rives and his team created a database of over 600 million protein structures that could be used for drug development. There, they developed earlier versions of the ESM model, including ESMFold, which used a large language model trained on biological data to predict protein structures. Meta cut the team in 2023, during what chief executive Mark Zuckerberg called the “year of efficiency,” as the company shifted its focus to commercial AI products.

The company also announced a new paper, in preview, showing it had prompted ESM3 to generate new green fluorescent proteins, or GFPs, the proteins responsible for how jellyfish and coral glow, with a sequence “that is only 58% similar to the closest known fluorescent protein.” “From the rate of diversification of GFPs found in nature, we estimate that this generation of a new fluorescent protein is equivalent to simulating over 500 million years of evolution,” EvolutionaryScale said.