Tag: language models
-
Hacker News: SmolLM2
Source URL: https://simonwillison.net/2024/Nov/2/smollm2/ Source: Hacker News Title: SmolLM2 Feedly Summary: Comments AI Summary and Description: Yes Summary: The text introduces SmolLM2, a new family of compact language models from Hugging Face, designed for lightweight on-device operations. The models, which range from 135M to 1.7B parameters, were trained on 11 trillion tokens across diverse datasets, showcasing…
-
Hacker News: Prompts are Programs
Source URL: https://blog.sigplan.org/2024/10/22/prompts-are-programs/ Source: Hacker News Title: Prompts are Programs Feedly Summary: Comments AI Summary and Description: Yes Summary: The text explores the parallels between AI model prompts and traditional software programs, emphasizing the need for programming language and software engineering communities to adapt and create new research avenues. As ChatGPT and similar large language…
-
Simon Willison’s Weblog: SmolLM2
Source URL: https://simonwillison.net/2024/Nov/2/smollm2/#atom-everything Source: Simon Willison’s Weblog Title: SmolLM2 Feedly Summary: New from Loubna Ben Allal and her research team at Hugging Face: SmolLM2 is a family of compact language models available in three sizes: 135M, 360M, and 1.7B parameters. They are capable of solving a wide range of tasks while being lightweight enough…
-
Slashdot: Waymo Explores Using Google’s Gemini To Train Its Robotaxis
Source URL: https://tech.slashdot.org/story/24/11/01/2150228/waymo-explores-using-googles-gemini-to-train-its-robotaxis?utm_source=rss1.0mainlinkanon&utm_medium=feed Source: Slashdot Title: Waymo Explores Using Google’s Gemini To Train Its Robotaxis Feedly Summary: AI Summary and Description: Yes Summary: Waymo’s introduction of its new training model for autonomous driving, called EMMA, highlights a significant advancement in the application of multimodal large language models (MLLMs) in operational environments beyond traditional uses. This…
-
Hacker News: AMD Open-Source 1B OLMo Language Models
Source URL: https://www.amd.com/en/developer/resources/technical-articles/introducing-the-first-amd-1b-language-model.html Source: Hacker News Title: AMD Open-Source 1B OLMo Language Models Feedly Summary: Comments AI Summary and Description: Yes Summary: The text discusses AMD’s development and release of the OLMo series, a set of open-source large language models (LLMs) designed to cater to specific organizational needs through customizable training and architecture adjustments. This…
-
Simon Willison’s Weblog: From Naptime to Big Sleep: Using Large Language Models To Catch Vulnerabilities In Real-World Code
Source URL: https://simonwillison.net/2024/Nov/1/from-naptime-to-big-sleep/#atom-everything Source: Simon Willison’s Weblog Title: From Naptime to Big Sleep: Using Large Language Models To Catch Vulnerabilities In Real-World Code Feedly Summary: Google’s Project Zero security team used a system based around Gemini 1.5 Pro to find…
-
Simon Willison’s Weblog: Claude API: PDF support (beta)
Source URL: https://simonwillison.net/2024/Nov/1/claude-api-pdf-support-beta/#atom-everything Source: Simon Willison’s Weblog Title: Claude API: PDF support (beta) Feedly Summary: Claude 3.5 Sonnet now accepts PDFs as attachments: The new Claude 3.5 Sonnet (claude-3-5-sonnet-20241022) model now supports PDF input and understands both text and visual content within documents. I just released llm-claude-3 0.7 with support…
-
Hacker News: Using Large Language Models to Catch Vulnerabilities
Source URL: https://googleprojectzero.blogspot.com/2024/10/from-naptime-to-big-sleep.html Source: Hacker News Title: Using Large Language Models to Catch Vulnerabilities Feedly Summary: Comments AI Summary and Description: Yes Summary: The Big Sleep project, a collaboration between Google Project Zero and Google DeepMind, has successfully discovered a previously unknown exploitable memory-safety vulnerability in SQLite through AI-assisted analysis, marking a significant advancement in…
-
Hacker News: Dawn: Designing Distributed Agents in a Worldwide Network
Source URL: https://arxiv.org/abs/2410.22339 Source: Hacker News Title: Dawn: Designing Distributed Agents in a Worldwide Network Feedly Summary: Comments AI Summary and Description: Yes Summary: The text discusses the design of DAWN, a framework for integrating Large Language Model (LLM)-based agents into a distributed network. It highlights the need for safety, security, and compliance in agent…
-
Hacker News: Physical Intelligence’s first generalist robotic model
Source URL: https://www.physicalintelligence.company/blog/pi0?blog Source: Hacker News Title: Physical Intelligence’s first generalist robotic model Feedly Summary: Comments AI Summary and Description: Yes Summary: The text discusses the development of π0, a general-purpose robot foundation model aimed at enabling robots to perform a wide range of tasks with greater dexterity and autonomy. This marks a significant step…