Tag: post-training quantization
-
Hacker News: SVDQuant: 4-Bit Quantization Powers 12B Flux on a 16GB 4090 GPU with 3x Speedup
Source URL: https://hanlab.mit.edu/blog/svdquant
Source: Hacker News
Title: SVDQuant: 4-Bit Quantization Powers 12B Flux on a 16GB 4090 GPU with 3x Speedup
Feedly Summary: Comments
AI Summary and Description: Yes

**Summary:** The provided text discusses the SVDQuant paradigm for post-training quantization of diffusion models, which improves computational efficiency by quantizing both weights and activations to…
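The summary is truncated, but the linked post describes quantizing both weights and activations to 4 bits. Below is a minimal NumPy sketch of the kind of decomposition SVDQuant is built around, assuming a small high-precision low-rank branch absorbs weight outliers while the residual is quantized to 4 bits; the rank, the per-tensor symmetric quantizer, and all function names are illustrative choices, not the authors' implementation, and the activation-quantization side is omitted here.

```python
# Illustrative low-rank + 4-bit residual decomposition in the spirit of
# SVDQuant (not the authors' implementation): W ~= L + dequant(Q), where L is
# a rank-r branch kept in high precision and Q is the int4-quantized residual,
# so outlier structure lands in L instead of blowing up the int4 scale.
import numpy as np

def quantize_int4(x):
    """Symmetric per-tensor int4 quantization: codes in [-7, 7] plus one scale."""
    scale = np.abs(x).max() / 7.0
    q = np.clip(np.round(x / scale), -7, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

def lowrank_plus_int4(W, rank=16):
    """Split W into a rank-`rank` fp16 branch and an int4-quantized residual."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    L = (U[:, :rank] * S[:rank]) @ Vt[:rank]    # low-rank branch, high precision
    Q, scale = quantize_int4(W - L)             # residual goes to 4 bits
    return L.astype(np.float16), Q, scale

rng = np.random.default_rng(0)
W = rng.normal(size=(512, 512)).astype(np.float32)
W[:, 0] *= 20.0                                 # outlier channel that wrecks naive int4

L, Q, scale = lowrank_plus_int4(W, rank=16)
W_hat = L.astype(np.float32) + dequantize(Q, scale)

nq, ns = quantize_int4(W)
print("naive int4 mean abs error   :", np.abs(W - dequantize(nq, ns)).mean())
print("low-rank + int4 mean abs err:", np.abs(W - W_hat).mean())
```

The outlier column inflates the naive quantizer's single scale so that ordinary weights round to zero; routing it into the low-rank branch lets the int4 scale fit the residual instead.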
-
Hacker News: VPTQ: Extreme low-bit Quantization for real LLMs
Source URL: https://github.com/microsoft/VPTQ
Source: Hacker News
Title: VPTQ: Extreme low-bit Quantization for real LLMs
Feedly Summary: Comments
AI Summary and Description: Yes

**Summary:** The text discusses a novel technique called Vector Post-Training Quantization (VPTQ) designed for compressing Large Language Models (LLMs) to extremely low bit-widths (under 2 bits) without compromising accuracy. This innovative method can…
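As a rough illustration of how vector quantization reaches sub-2-bit storage, here is a generic k-means codebook sketch; it is not VPTQ's actual algorithm, and every parameter and function name below is an illustrative assumption. The idea: group weights into short vectors and replace each with an index into a shared codebook, so with vector length v and a 2^b-entry codebook the cost is b/v bits per weight.

```python
# Generic weight vector quantization via k-means (illustrative; not VPTQ's
# actual algorithm). Length-8 vectors indexed into a 2^12-entry shared
# codebook cost 12/8 = 1.5 bits per weight, plus the amortized codebook.
import numpy as np

def vq_compress(W, vec_len=8, codebook_bits=12, iters=10, seed=0):
    """Cluster weight vectors with Lloyd's k-means; return indices + codebook."""
    vecs = W.reshape(-1, vec_len).astype(np.float32)
    k = 2 ** codebook_bits
    rng = np.random.default_rng(seed)
    codebook = vecs[rng.choice(len(vecs), size=k, replace=False)]
    for _ in range(iters):
        # Squared distances via ||v||^2 - 2 v.c + ||c||^2, avoiding a 3-D tensor.
        d = ((vecs ** 2).sum(1, keepdims=True)
             - 2.0 * vecs @ codebook.T
             + (codebook ** 2).sum(1))
        idx = d.argmin(1)
        for c in range(k):                      # recompute centroids
            members = vecs[idx == c]
            if len(members):
                codebook[c] = members.mean(0)
    return idx.astype(np.uint16), codebook      # 2^12 codes fit in uint16

def vq_decompress(idx, codebook, shape):
    return codebook[idx].reshape(shape)

rng = np.random.default_rng(1)
W = rng.normal(size=(256, 256)).astype(np.float32)
idx, codebook = vq_compress(W)
W_hat = vq_decompress(idx, codebook, W.shape)
bits_per_weight = np.log2(len(codebook)) / (W.size // len(idx))
print(f"{bits_per_weight:.2f} bits/weight, "
      f"mean abs error {np.abs(W - W_hat).mean():.3f}")
```

Longer vectors or smaller codebooks push the bit-width lower at the cost of reconstruction error, which is the trade-off any sub-2-bit scheme has to manage.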