SeedLM: Compressing LLM Weights into Seeds of Pseudo-Random Generators
Authors: Rasoul Shafipour, David Harrison, Maxwell Horton, Jeffrey Marker, Houman Bedayat, Sachin Mehta†, Mohammad Rastegari†, Mahyar Najibi, Saman Naderiparizi

Large Language Models (LLMs) have transformed natural language processing, but face significant challenges in widespread deployment due to their high runtime cost. In this paper, we introduce SeedLM, a novel post-training compression method that uses seeds of a pseudo-random generator to encode and compress model weights. Specifically, for ...
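To make the core idea concrete, here is a minimal sketch of what "compressing weights into seeds of a pseudo-random generator" can look like: for each weight block, search over candidate seeds, regenerate a pseudo-random basis from each seed, fit a few coefficients by least squares, and keep only the best seed plus its coefficients. This is an illustration of the general idea from the abstract, not the paper's actual algorithm; the use of NumPy's `default_rng`, and the block size, rank, and seed-range parameters, are all assumptions for the sketch.

```python
import numpy as np


def compress_block(block, num_seeds=256, rank=4):
    """Hypothetical sketch: represent a weight block by a PRNG seed
    plus a few coefficients over the basis that seed regenerates."""
    flat = block.ravel()
    best = None
    for seed in range(num_seeds):
        rng = np.random.default_rng(seed)
        # Pseudo-random basis; reproducible from the seed at decode time.
        basis = rng.standard_normal((flat.size, rank))
        coeffs, *_ = np.linalg.lstsq(basis, flat, rcond=None)
        err = np.linalg.norm(basis @ coeffs - flat)
        if best is None or err < best[0]:
            best = (err, seed, coeffs)
    _, seed, coeffs = best
    # Only the seed and the small coefficient vector need storing.
    return seed, coeffs


def decompress_block(seed, coeffs, shape):
    """Rebuild the block by regenerating the basis from the stored seed."""
    rng = np.random.default_rng(seed)
    basis = rng.standard_normal((int(np.prod(shape)), coeffs.size))
    return (basis @ coeffs).reshape(shape)


# Usage: compress and approximately reconstruct a small weight block.
w = np.random.default_rng(0).standard_normal((8, 8))
seed, coeffs = compress_block(w)
w_hat = decompress_block(seed, coeffs, w.shape)
```

The appeal of this representation is that the basis never has to be stored: it is recomputed from the seed on the fly, so the memory footprint reduces to one integer and a handful of coefficients per block.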
Read more at machinelearning.apple.com