SeedLM: Compressing LLM Weights into Seeds of Pseudo-Random Generators
Authors: Rasoul Shafipour, David Harrison, Maxwell Horton, Jeffrey Marker, Houman Bedayat, Sachin Mehta†, Mohammad Rastegari†, Mahyar Najibi, Saman Naderiparizi

Large Language Models (LLMs) have transformed natural language processing, but face significant challenges in widespread deployment due to their high runtime cost. In this paper, we introduce SeedLM, a novel post-training compression method that uses seeds of a pseudo-random generator to encode and compress model weights. Specifically, for ...
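To make the core idea concrete, here is a minimal sketch of what "compressing weights into seeds of a pseudo-random generator" can look like: for each weight block, search over candidate seeds, regenerate a pseudo-random basis from each seed, fit a few coefficients by least squares, and keep only the best seed plus its coefficients. This is an illustration of the general idea from the abstract, not the paper's actual algorithm; the use of NumPy's `default_rng`, and the block size, rank, and seed-range parameters, are all assumptions for the sketch.

```python
import numpy as np


def compress_block(block, num_seeds=256, rank=4):
    """Hypothetical sketch: represent a weight block by a PRNG seed
    plus a few coefficients over the basis that seed regenerates."""
    flat = block.ravel()
    best = None
    for seed in range(num_seeds):
        rng = np.random.default_rng(seed)
        # Pseudo-random basis; reproducible from the seed at decode time.
        basis = rng.standard_normal((flat.size, rank))
        coeffs, *_ = np.linalg.lstsq(basis, flat, rcond=None)
        err = np.linalg.norm(basis @ coeffs - flat)
        if best is None or err < best[0]:
            best = (err, seed, coeffs)
    _, seed, coeffs = best
    # Only the seed and the small coefficient vector need storing.
    return seed, coeffs


def decompress_block(seed, coeffs, shape):
    """Rebuild the block by regenerating the basis from the stored seed."""
    rng = np.random.default_rng(seed)
    basis = rng.standard_normal((int(np.prod(shape)), coeffs.size))
    return (basis @ coeffs).reshape(shape)


# Usage: compress and approximately reconstruct a small weight block.
w = np.random.default_rng(0).standard_normal((8, 8))
seed, coeffs = compress_block(w)
w_hat = decompress_block(seed, coeffs, w.shape)
```

The appeal of this representation is that the basis never has to be stored: it is recomputed from the seed on the fly, so the memory footprint reduces to one integer and a handful of coefficients per block.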
Read more at machinelearning.apple.com