Tag: memory requirements
-
Hacker News: VPTQ: Extreme low-bit Quantization for real LLMs
Source URL: https://github.com/microsoft/VPTQ
Source: Hacker News
Title: VPTQ: Extreme low-bit Quantization for real LLMs
Feedly Summary: Comments
AI Summary and Description: Yes

**Summary:** The text discusses a novel technique called Vector Post-Training Quantization (VPTQ), designed to compress Large Language Models (LLMs) to extremely low bit-widths (under 2 bits) without compromising accuracy. This method can…
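The core idea behind vector quantization schemes like VPTQ — why they can reach fractional bit-widths below 2 bits per weight — can be sketched in a toy form. This is an illustrative sketch of generic vector quantization, not the actual VPTQ algorithm or the microsoft/VPTQ API; all function names and parameters here are invented for demonstration. Weights are grouped into short vectors and each vector is replaced by the index of its nearest codebook centroid, so with 2^k centroids over v-dimensional vectors the cost is k/v bits per weight (e.g. 4096 centroids over 8-dim vectors ≈ 1.5 bits/weight):

```python
import numpy as np

# Toy vector quantization of a weight matrix (NOT the VPTQ implementation):
# split weights into short vectors, learn a k-means codebook, and store one
# centroid index per vector instead of the raw floats.

def build_codebook(weights, vec_dim=4, num_centroids=16, iters=5, seed=0):
    """Learn a codebook over weight vectors with a few Lloyd's iterations."""
    rng = np.random.default_rng(seed)
    vecs = weights.reshape(-1, vec_dim)
    # initialize centroids from randomly chosen weight vectors
    codebook = vecs[rng.choice(len(vecs), num_centroids, replace=False)]
    for _ in range(iters):
        # assign each vector to its nearest centroid (squared L2 distance)
        dists = ((vecs[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(axis=1)
        # move each centroid to the mean of its assigned vectors
        for c in range(num_centroids):
            members = vecs[assign == c]
            if len(members):
                codebook[c] = members.mean(axis=0)
    return codebook

def quantize(weights, codebook):
    """Return one integer codebook index per weight vector."""
    vecs = weights.reshape(-1, codebook.shape[1])
    dists = ((vecs[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return dists.argmin(axis=1)

def dequantize(indices, codebook, shape):
    """Reconstruct an approximate weight matrix from indices + codebook."""
    return codebook[indices].reshape(shape)
```

With `vec_dim=4` and `num_centroids=16`, each 4-float vector is stored as a 4-bit index, i.e. 1 bit per weight; the real method adds refinements (residual codebooks, outlier handling) to keep accuracy at such extreme compression.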
-
Simon Willison’s Weblog: Quoting Magic AI
Source URL: https://simonwillison.net/2024/Aug/30/magic-ai/#atom-everything
Source: Simon Willison’s Weblog
Title: Quoting Magic AI
Feedly Summary: We have recently trained our first 100M-token context model: LTM-2-mini. 100M tokens equals ~10 million lines of code or ~750 novels. For each decoded token, LTM-2-mini’s sequence-dimension algorithm is roughly 1000x cheaper than the attention mechanism in Llama 3.1 405B for…