"The Bitter Lesson," a famous 2019 blog post, claims that general AI methods using massive compute are the most effective. NVidia's soaring stock price supports this thesis, but is this approach sustainable? What are alternatives?
𝐓𝐡𝐞 𝐎𝐫𝐢𝐠𝐢𝐧𝐚𝐥 "𝐁𝐢𝐭𝐭𝐞𝐫 𝐋𝐞𝐬𝐬𝐨𝐧"
In the original blog post (link in comments), AI pioneer Rich Sutton makes the following observations:
• Over the last 70 years, AI researchers have repeatedly made the mistake of trying to bake human knowledge into AI systems, only to be eventually outperformed by more general methods using brute force compute.
• Prominent examples: knowledge-heavy Chess/Go programs vs. the search- and learning-based Deep Blue/AlphaZero, and hand-crafted edge detectors and SIFT features vs. ConvNets.
The main reasons are:
• Building in expert knowledge is personally satisfying for the experts and often useful in the short term.
• Researchers tend to think in terms of fixed available compute, when it’s actually increasing daily.
Sutton concludes that we should:
• Focus on general AI methods that can continue to scale, most notably search and learning.
• Stop trying to bake the contents of the human mind into AI systems, as those contents are too complex, and instead focus on finding meta-methods that can capture this complexity themselves.
𝐑𝐞𝐬𝐩𝐨𝐧𝐬𝐞𝐬
“The Bitter Lesson” triggered numerous responses (see comments). People quickly pointed out that:
• Moore’s Law is fading out, so “just adding more GPUs” won’t yield new breakthroughs for much longer.
• The architectures of our most successful deep learning models were actually carefully hand-crafted by humans (ConvNets, Transformers, LSTMs, etc.).
• For general computational problems (e.g., integer factorization), progress based on human understanding was often far greater than progress according to Moore’s Law.
𝐖𝐡𝐞𝐫𝐞 𝐢𝐭 𝐥𝐞𝐚𝐯𝐞𝐬 𝐮𝐬 𝐭𝐨𝐝𝐚𝐲
Companies like OpenAI show that focusing on “more compute” may still yield massive gains: despite a waning Moore’s Law, compute power is expected to increase by several orders of magnitude over the next decades.
However, demand for AI compute is currently growing far faster than Moore's Law can deliver (see image below; source link in comments). As a result, AI risks leaving a deep environmental footprint, and research is increasingly restricted to large corporations that can afford the compute. That's the bitter lesson of the last year.
As so often, the solution will lie somewhere in the middle.
Geometric deep learning and physics-based approaches are good examples of hybrid approaches. Rather than forcing AI systems to think like humans or blindly explore the solution space using compute, researchers are using symmetries and fundamental insights from physics to set the right inductive biases and priors to then find the best solutions using raw compute.
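To make the idea of symmetry-based inductive biases concrete, here is a minimal, hypothetical sketch (not from the original post) of a DeepSets-style model in NumPy: because it pools its inputs with a symmetric sum, the model is permutation-invariant by construction, rather than having to learn that symmetry from data with raw compute. All names and shapes below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative weights: a per-element encoder and a readout (shapes assumed).
W_phi = rng.normal(size=(3, 8))  # encodes each 3-d set element to 8-d
w_rho = rng.normal(size=8)       # maps the pooled representation to a scalar

def deepsets(x):
    """x: (n_elements, 3) array treated as an unordered set -> scalar."""
    h = np.maximum(x @ W_phi, 0.0)  # encode each element independently (ReLU)
    pooled = h.sum(axis=0)          # symmetric sum pooling: order cannot matter
    return float(pooled @ w_rho)

x = rng.normal(size=(5, 3))
perm = rng.permutation(5)
# Shuffling the set leaves the output unchanged, by construction:
assert np.isclose(deepsets(x), deepsets(x[perm]))
```

The symmetry is guaranteed by the architecture (the sum commutes with any permutation), which is exactly the kind of prior geometric deep learning builds in before handing the rest of the problem to compute.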
Let's see where these approaches lead us, and how we think about the "Bitter Lesson" in a decade or more.
#AI #machinelearning #compute