Tag: Vector Post-Training Quantization

  • Hacker News: VPTQ: Extreme low-bit Quantization for real LLMs

    Source URL: https://github.com/microsoft/VPTQ Source: Hacker News Title: VPTQ: Extreme low-bit Quantization for real LLMs Feedly Summary: Comments AI Summary and Description: Yes **Summary:** The text discusses a novel technique called Vector Post-Training Quantization (VPTQ) designed for compressing Large Language Models (LLMs) to extremely low bit-widths (under 2 bits) without compromising accuracy. This innovative method can…