News Score: Score the News, Sort the News, Rewrite the Headlines

Making AMD GPUs competitive for LLM inference

Aug 9, 2023

TL;DR: MLC-LLM makes it possible to compile LLMs and deploy them on AMD GPUs using ROCm with competitive performance. More specifically, AMD Radeon™ RX 7900 XTX gives 80% of the speed of NVIDIA® GeForce RTX™ 4090 and 94% of the speed of NVIDIA® GeForce RTX™ 3090Ti for Llama2-7B/13B. Besides ROCm, our Vulkan support allows us to generalize LLM deployment to other AMD devices, for example, a SteamDeck with an AMD APU.

Background

There have been many LLM inference solutions since the b...

Read more at blog.mlc.ai
