Tag: model architecture
-
Hacker News: 20x faster convergence for diffusion models
Source URL: https://sihyun.me/REPA/
Summary: The text discusses a novel technique, REPresentation Alignment (REPA), which enhances the performance of generative diffusion models by improving internal representation alignment with self-supervised visual representations. This method significantly increases training efficiency and…
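The summary only names the idea, so here is a minimal sketch of what "representation alignment" can look like in practice. This is an assumption-laden illustration, not REPA's actual implementation: it models the alignment term as a negative cosine similarity between (projected) diffusion hidden states and features from a frozen self-supervised encoder, which would be added to the usual denoising loss with some weighting coefficient.

```python
import numpy as np

def alignment_loss(hidden: np.ndarray, target: np.ndarray, eps: float = 1e-8) -> float:
    """Negative mean patch-wise cosine similarity between projected diffusion
    hidden states and frozen self-supervised features (illustrative only).

    hidden, target: arrays of shape (batch, patches, dim).
    Returns a value in [-1, 1]; -1 means perfectly aligned.
    """
    h = hidden / (np.linalg.norm(hidden, axis=-1, keepdims=True) + eps)
    t = target / (np.linalg.norm(target, axis=-1, keepdims=True) + eps)
    return -float(np.mean(np.sum(h * t, axis=-1)))

# Hypothetical total objective (names and weighting are assumptions):
#   total_loss = denoising_loss + lam * alignment_loss(projected_hidden, ssl_features)
```

In this framing the alignment term acts as a regularizer: the denoising objective is unchanged, and the extra term nudges intermediate representations toward those of a strong pretrained visual encoder.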
-
Hacker News: FLUX1.1 [pro] – New SotA text-to-image model from Black Forest Labs
Source URL: https://replicate.com/black-forest-labs/flux-1.1-pro
Summary: The text discusses the pricing model and improvements of the FLUX1.1 [pro] image generation model, emphasizing its advancements in speed, quality, and efficiency over its predecessor.
Detailed Description:…
-
Hacker News: MM1.5: Methods, Analysis and Insights from Multimodal LLM Fine-Tuning
Source URL: https://arxiv.org/abs/2409.20566
Summary: The paper introduces MM1.5, a novel set of multimodal large language models (MLLMs) aimed at improving multimodal understanding and reasoning through enhanced training methodologies. It highlights innovative techniques in data…
-
Simon Willison’s Weblog: llama-3.2-webgpu
Source URL: https://simonwillison.net/2024/Sep/30/llama-32-webgpu/#atom-everything
Summary: Llama 3.2 1B is a really interesting model, given its 128,000-token input and its tiny size (barely more than a GB). This page loads a 1.24GB q4f16 ONNX build of the Llama-3.2-1B-Instruct model and runs it with a React-powered chat interface directly…