Tag: adversarial inputs
-
Hacker News: SmoothLLM: Defending Large Language Models Against Jailbreaking Attacks
Source URL: https://arxiv.org/abs/2310.03684 Source: Hacker News Title: SmoothLLM: Defending Large Language Models Against Jailbreaking Attacks Feedly Summary: Comments AI Summary and Description: Yes Summary: This text presents “SmoothLLM,” an innovative algorithm designed to enhance the security of Large Language Models (LLMs) against jailbreaking attacks, which manipulate models into producing undesirable content. The proposal highlights a…