GitHub - microsoft/VPTQ: VPTQ, A Flexible and Extreme low-bit quantization algorithm
VPTQ: Extreme Low-bit Vector Post-Training Quantization for Large Language Models
TL;DR
News
Installation
Dependencies
Install VPTQ on your machine
Evaluation
Models from Open Source Community
Language Generation Example
Terminal Chatbot Example
Python API Example
Gradio Web App Example
Tech Report
Early Results from Tech Report
Road Map
Project main members:
Acknowledgement
Publication
Star History
Limitation of VPTQ
Contributing
Trademarks
TL;DR
Vector Post-Training Quantization (VPTQ) is a no...