Cloud Security Alliance News Clipping Site

Tag: throughput rates

The Register: Benchmarks show even an old Nvidia RTX 3090 is enough to serve LLMs to thousands

Aug 23, 2024

by system automation, in Uncategorized

Source URL: https://www.theregister.com/2024/08/23/3090_ai_benchmark/

Source: The Register

Title: Benchmarks show even an old Nvidia RTX 3090 is enough to serve LLMs to thousands

Feedly Summary: For 100 concurrent users, the card delivered 12.88 tokens per second, just slightly faster than average human reading speed. If you want to scale a large language model (LLM) to a few…