prompt injection attacks - Cloud Security Alliance News Clipping Site

Simon Willison’s Weblog: Notes from Bing Chat—Our First Encounter With Manipulative AI

Nov 19, 2024

—

by

Source URL: https://simonwillison.net/2024/Nov/19/notes-from-bing-chat/#atom-everything Source: Simon Willison’s Weblog Title: Notes from Bing Chat—Our First Encounter With Manipulative AI Feedly Summary: A participated in an Ars Live conversation with Benj Edwards of Ars Technica today, talking about that wild period of LLM history last year when Microsoft launched Bing Chat and it instantly started misbehaving, gaslighting and…

Hacker News: Garak, LLM Vulnerability Scanner

Nov 17, 2024

—

by

system automation

in Uncategorized

Source URL: https://github.com/NVIDIA/garak Source: Hacker News Title: Garak, LLM Vulnerability Scanner Feedly Summary: Comments AI Summary and Description: Yes **Summary:** The text describes “garak,” a command-line vulnerability scanner specifically designed for large language models (LLMs). This tool aims to uncover various weaknesses in LLMs, such as hallucination, prompt injection attacks, and data leakage. Its development…

Simon Willison’s Weblog: Quoting Question for Department for Science, Innovation and Technology

Nov 1, 2024

—

by

system automation

in Uncategorized

Source URL: https://simonwillison.net/2024/Nov/1/prompt-injection/#atom-everything Source: Simon Willison’s Weblog Title: Quoting Question for Department for Science, Innovation and Technology Feedly Summary: Lord Clement-Jones: To ask His Majesty’s Government what assessment they have made of the cybersecurity risks posed by prompt injection attacks to the processing by generative artificial intelligence of material provided from outside government, and whether…

The Register: Google reportedly developing an AI agent that can control your browser

Oct 28, 2024

—

by

system automation

in Uncategorized

Source URL: https://www.theregister.com/2024/10/28/google_ai_web_agent/ Source: The Register Title: Google reportedly developing an AI agent that can control your browser Feedly Summary: Project Jarvis will apparently conduct research, purchase products, and even book a flight on your behalf Google is reportedly looking to sidestep the complexity of AI-driven automation by letting its multimodal large language models (LLMs)…

Embrace The Red: ZombAIs: From Prompt Injection to C2 with Claude Computer Use

Oct 25, 2024

—

by

system automation

in Uncategorized

Source URL: https://embracethered.com/blog/posts/2024/claude-computer-use-c2-the-zombais-are-coming/ Source: Embrace The Red Title: ZombAIs: From Prompt Injection to C2 with Claude Computer Use Feedly Summary: A few days ago, Anthropic released Claude Computer Use, which is a model + code that allows Claude to control a computer. It takes screenshots to make decisions, can run bash commands and so forth.…

The Register: Anthropic’s latest Claude model can interact with computers – what could go wrong?

Oct 24, 2024

—

by

system automation

in Uncategorized

Source URL: https://www.theregister.com/2024/10/24/anthropic_claude_model_can_use_computers/ Source: The Register Title: Anthropic’s latest Claude model can interact with computers – what could go wrong? Feedly Summary: For starters, it could launch a prompt injection attack on itself… The latest version of AI startup Anthropic’s Claude 3.5 Sonnet model can use computers – and the developer makes it sound like…

Simon Willison’s Weblog: Quoting Model Card Addendum: Claude 3.5 Haiku and Upgraded Sonnet

Oct 23, 2024

—

by

system automation

in Uncategorized

Source URL: https://simonwillison.net/2024/Oct/23/model-card/#atom-everything Source: Simon Willison’s Weblog Title: Quoting Model Card Addendum: Claude 3.5 Haiku and Upgraded Sonnet Feedly Summary: We enhanced the ability of the upgraded Claude 3.5 Sonnet and Claude 3.5 Haiku to recognize and resist prompt injection attempts. Prompt injection is an attack where a malicious user feeds instructions to a model…

Simon Willison’s Weblog: Initial explorations of Anthropic’s new Computer Use capability

Oct 22, 2024

—

by

system automation

in Uncategorized

Source URL: https://simonwillison.net/2024/Oct/22/computer-use/#atom-everything Source: Simon Willison’s Weblog Title: Initial explorations of Anthropic’s new Computer Use capability Feedly Summary: Two big announcements from Anthropic today: a new Claude 3.5 Sonnet model and a new API mode that they are calling computer use. (They also pre-announced Haiku 3.5, but that’s not available yet so I’m ignoring it…

Simon Willison’s Weblog: This prompt can make an AI chatbot identify and extract personal details from your chats

Oct 22, 2024

—

by

system automation

in Uncategorized

Source URL: https://simonwillison.net/2024/Oct/22/imprompter/#atom-everything Source: Simon Willison’s Weblog Title: This prompt can make an AI chatbot identify and extract personal details from your chats Feedly Summary: This prompt can make an AI chatbot identify and extract personal details from your chats Matt Burgess in Wired magazine writes about a new prompt injection / Markdown exfiltration variant…

Hacker News: Hacker plants false memories in ChatGPT to steal user data in perpetuity

Sep 24, 2024

—

by

system automation

in Uncategorized

Source URL: https://arstechnica.com/security/2024/09/false-memories-planted-in-chatgpt-give-hacker-persistent-exfiltration-channel/ Source: Hacker News Title: Hacker plants false memories in ChatGPT to steal user data in perpetuity Feedly Summary: Comments AI Summary and Description: Yes Summary: The text discusses a vulnerability discovered in ChatGPT that allowed for malicious manipulation of its long-term memory feature through prompt injection. While OpenAI has released a partial…

Tag: prompt injection attacks