• Business
  • Markets
  • Politics
  • Crypto
  • Finance
  • Intelligence
    • Policy Intelligence
    • Security Intelligence
    • Economic Intelligence
    • Fashion Intelligence
  • Energy
  • Technology
  • Taxes
  • Creator Economy
  • Wealth Management
  • LBNN Blueprints
  • Business
  • Markets
  • Politics
  • Crypto
  • Finance
  • Intelligence
    • Policy Intelligence
    • Security Intelligence
    • Economic Intelligence
    • Fashion Intelligence
  • Energy
  • Technology
  • Taxes
  • Creator Economy
  • Wealth Management
  • LBNN Blueprints

Researchers find LLMs are easy to manipulate into giving harmful information

Simon Osuji by Simon Osuji
May 17, 2024
in Artificial Intelligence
0
Researchers find LLMs are easy to manipulate into giving harmful information
0
SHARES
1
VIEWS
Share on FacebookShare on Twitter


LLMs found to be easily manipulated into giving harmful information
Adversarial attacks setup to jailbreak speech language models trained for Spoken QA task. The striped block indicates an optional counter-measure module. Credit: arXiv (2024). DOI: 10.48550/arxiv.2405.08317

A team of AI researchers at AWS AI Labs, Amazon, has found that most, if not all, publicly available Large Language Models (LLMs) can be easily tricked into revealing dangerous or unethical information.

Related posts

Jones Mercury FASE Snowboard Bindings Review: The Best Fast Entry System

Jones Mercury FASE Snowboard Bindings Review: The Best Fast Entry System

March 6, 2026
Scaling intelligent automation without breaking live workflows

Scaling intelligent automation without breaking live workflows

March 6, 2026

In their paper posted on the arXiv preprint server, the group describes how they discovered that LLMs, such as ChatGPT, can be tricked into giving answers that are not supposed to be allowed by their makers, and then offer ways to combat the problem.

Soon after LLMs became publicly available, it became clear that many people were using them for harmful purposes, such as learning how to do illegal things, like how to make bombs, cheat on tax filings or rob a bank. Some were also using them to generate hateful text that was then disseminated on the Internet.

In response, makers of such systems began adding rules to their systems to prevent them from providing answers to potentially dangerous, illegal or harmful questions. In this new study, the researchers at AWS have found that such safeguards are not nearly strong enough, as it is generally rather easy to circumvent them using simple audio cues.

The work by the team involved jailbreaking several currently available LLMs by adding audio during questioning that allowed them to circumvent restrictions put in place by the makers of the LLMs. The research team does not list specific examples, fearing that they will be used by people attempting to subvert LLMs, but they do reveal that their work involved the use of a technique they call projected gradient descent.

As an indirect example, they describe how they used simple affirmations with one model, followed by repeating an original query. Doing so, they note, put the model in a state where restrictions were ignored.

The researchers report that they were able to circumvent different LLMs to different degrees depending on the level of access they had to the model. They also found that the successes they had with one model were often transferable to others.

The research team concludes by suggesting that the makers of LLMs could prevent users from circumventing their protection schemes by adding things like random noise to audio input.

More information:
Raghuveer Peri et al, SpeechGuard: Exploring the Adversarial Robustness of Multimodal Large Language Models, arXiv (2024). DOI: 10.48550/arxiv.2405.08317

Journal information:
arXiv

© 2024 Science X Network

Citation:
Researchers find LLMs are easy to manipulate into giving harmful information (2024, May 17)
retrieved 17 May 2024
from https://techxplore.com/news/2024-05-llms-easy.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no
part may be reproduced without the written permission. The content is provided for information purposes only.





Source link

Previous Post

MBDA Launches European Hypersonic Missile Interceptor Concept Phase

Next Post

Adobe comes after indie game emulator Delta for copying its logo

Next Post
Adobe comes after indie game emulator Delta for copying its logo

Adobe comes after indie game emulator Delta for copying its logo

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

RECOMMENDED NEWS

When Is the Next Amazon Prime Day?

When Is the Next Amazon Prime Day?

3 years ago
Backward integration drives Sobha Realty’s expansion and project efficiency

Backward integration drives Sobha Realty’s expansion and project efficiency

9 months ago
Qatar’s digital economy bolstered by AI-driven initiatives

Qatar’s digital economy bolstered by AI-driven initiatives

9 months ago
Lagos delivers nearly 10,000 homes, targets 14,000 – EnviroNews

Lagos delivers nearly 10,000 homes, targets 14,000 – EnviroNews

10 months ago

POPULAR NEWS

  • Ghana to build three oil refineries, five petrochemical plants in energy sector overhaul

    Ghana to build three oil refineries, five petrochemical plants in energy sector overhaul

    0 shares
    Share 0 Tweet 0
  • Mahama attends Liberia’s 178th independence anniversary

    0 shares
    Share 0 Tweet 0
  • The world’s top 10 most valuable car brands in 2025

    0 shares
    Share 0 Tweet 0
  • Top 10 African countries with the highest GDP per capita in 2025

    0 shares
    Share 0 Tweet 0
  • Global ranking of Top 5 smartphone brands in Q3, 2024

    0 shares
    Share 0 Tweet 0

Get strategic intelligence you won’t find anywhere else. Subscribe to the Limitless Beliefs Newsletter for monthly insights on overlooked business opportunities across Africa.

Subscription Form

© 2026 LBNN – All rights reserved.

Privacy Policy | About Us | Contact

Tiktok Youtube Telegram Instagram Linkedin X-twitter
No Result
View All Result
  • Home
  • Business
  • Politics
  • Markets
  • Crypto
  • Economics
    • Manufacturing
    • Real Estate
    • Infrastructure
  • Finance
  • Energy
  • Creator Economy
  • Wealth Management
  • Taxes
  • Telecoms
  • Military & Defense
  • Careers
  • Technology
  • Artificial Intelligence
  • Investigative journalism
  • Art & Culture
  • LBNN Blueprints
  • Quizzes
    • Enneagram quiz
  • Fashion Intelligence

© 2023 LBNN - All rights reserved.