"Researchers Develop Cost-Efficient Methods for Running Large Language Models over Internet, Utilizing Idle Compute Resources and Special Fault-Tolerant Algorithms"

Distributed Inference and Fine-tuning of Large Language Models Over The Internet

Alexander Borzunov (HSE University, Yandex), Max Ryabinin (HSE University, Yandex), Artem Chumachenko (Neiro.ai), Dmitry Baranchuk (Yandex), Tim Dettmers (University of Washington), Younes Belkada (Hugging Face), Pavel Samygin (Yandex School of Data Analysis), Colin Raffel (Hugging Face)

Abstract

Large language models (LLMs) are useful in many NLP tasks and become more capable with size, with the best open-source models having over 50 billion parameters. However, using these 50B+ models requires high-end hardware, ...