Tag: quantization
-
The Register: AMD’s Victor Peng: AI thirst for power underscores the need for efficient silicon
Source URL: https://www.theregister.com/2024/08/29/ai_thirst_for_power/
Source: The Register
Title: AMD’s Victor Peng: AI thirst for power underscores the need for efficient silicon
Feedly Summary: Moore’s Law may be running out of steam, but there are still knobs to turn and levers to pull. Speaking at Hot Chips this week, AMD president Victor Peng addressed one…
-
The Register: Benchmarks show even an old Nvidia RTX 3090 is enough to serve LLMs to thousands
Source URL: https://www.theregister.com/2024/08/23/3090_ai_benchmark/
Source: The Register
Title: Benchmarks show even an old Nvidia RTX 3090 is enough to serve LLMs to thousands
Feedly Summary: For 100 concurrent users, the card delivered 12.88 tokens per second, just slightly faster than average human reading speed. If you want to scale a large language model (LLM) to a few…