Charlie Lee, PhD’s Post


Genomics Industry Lead | Experienced R&D Leader

https://lnkd.in/gcgHTFhg Matrix multiplication (MatMul) is the most computationally expensive operation in large language models (LLMs) built on the Transformer architecture. As LLMs scale to larger sizes, the cost of MatMul grows significantly, increasing memory usage and latency during both training and inference. Now, researchers at the University of California, Santa Cruz, Soochow University, and the University of California, Davis have developed a novel architecture that eliminates matrix multiplications from language models entirely while maintaining strong performance at large scales.
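The article summarizes the result rather than the mechanism, but the reported approach in this line of work (BitNet-style quantization) constrains weights to the ternary set {-1, 0, +1}, which turns each dot product into additions and subtractions. The sketch below is illustrative only, not the researchers' code; the function names and the NumPy implementation are my own assumptions for demonstration.

```python
import numpy as np

def dense_forward(x, W):
    """Standard linear layer: one multiply-accumulate per weight (MatMul)."""
    return x @ W

def ternary_forward(x, W_ternary):
    """Linear layer whose weights are restricted to {-1, 0, +1}.

    Because every weight is -1, 0, or +1, each output element is just a
    sum of some inputs minus a sum of others -- no multiplications needed.
    """
    out = np.zeros((x.shape[0], W_ternary.shape[1]))
    for j in range(W_ternary.shape[1]):
        plus = W_ternary[:, j] == 1    # inputs to add
        minus = W_ternary[:, j] == -1  # inputs to subtract
        out[:, j] = x[:, plus].sum(axis=1) - x[:, minus].sum(axis=1)
    return out

# Toy check: both paths agree when the weight matrix happens to be ternary.
rng = np.random.default_rng(0)
x = rng.standard_normal((2, 8))
W = rng.integers(-1, 2, size=(8, 4)).astype(float)
assert np.allclose(dense_forward(x, W), ternary_forward(x, W))
```

This toy example only shows why ternary weights remove the multiplications; the actual models described in the article also depend on training techniques and hardware-optimized kernels to stay competitive at scale.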

New Transformer architecture could enable powerful LLMs without GPUs

https://venturebeat.com
