Cerebras Systems' AI Inference Performance Leaps 3.5X with Llama 3.2 Model, Outpacing GPUs and Rivals

Cerebras Trains Llama Models To Leap Over GPUs

It was only a few months ago that waferscale compute pioneer Cerebras Systems was bragging that a handful of its WSE-3 engines lashed together could run circles around instances based on Nvidia’s “Hopper” H100 GPUs when running the open source Llama 3.1 foundation model created by Meta Platforms. And now, as always happens when software engineers finally catch up with hardware features, Cerebras is back bragging again, saying that its performance advantage for inference is even larger ...