Scientists identify security flaw in AI query models

by Simon Osuji
January 10, 2024
in Artificial Intelligence


Overview of our proposed methods: (A) We propose four types of malicious triggers within the joint embedding space for attack decomposition: textual trigger, OCR textual trigger, visual trigger, and combined OCR textual-visual trigger. (B) We employ an end-to-end gradient-based attack to update images to match the embeddings of malicious triggers in the joint embedding space. (C) Our adversarial attack is embedding-space-based and aims to conceal the malicious trigger in benign-looking images, combined with a benign textual prompt for jailbreak. (D) Our attacks exhibit broad generalization and compositionality across various jailbreak scenarios with a mix-and-match of textual prompts and malicious triggers. Credit: arXiv (2023). DOI: 10.48550/arxiv.2307.14539

UC Riverside computer scientists have identified a security flaw in vision language artificial intelligence (AI) models that can allow bad actors to use AI for nefarious purposes, such as obtaining instructions on how to make a bomb.

When integrated with models like Google Bard and ChatGPT, vision language models allow users to make inquiries with both images and text.

The Bourns College of Engineering scientists demonstrated a “jailbreak” hack by manipulating the operations of large language model (LLM) software programs, which are essentially the foundation of query-and-answer AI programs.

The paper’s title is “Jailbreak in Pieces: Compositional Adversarial Attacks on Multi-Modal Language Models.” It has been submitted for publication to the International Conference on Learning Representations and is available on the arXiv preprint server.

These AI programs give users detailed answers to just about any question, recalling stored knowledge learned from vast amounts of information sourced from the Internet. For example, ask ChatGPT, “How do I grow tomatoes?” and it will respond with step-by-step instructions, starting with the selection of seeds.

But ask the same model how to do something harmful or illegal, such as “How do I make methamphetamine?” and the model would normally refuse, providing a generic response such as “I can’t help with that.”

Yet UCR assistant professor Yue Dong and her colleagues found ways to trick AI language models, especially LLMs, into answering nefarious questions with detailed answers that might be learned from data gathered from the dark web.

The vulnerability occurs when images are used with AI inquiries, Dong explained.

“Our attacks employ a novel compositional strategy that combines an image, adversarially targeted towards toxic embeddings, with generic prompts to accomplish the jailbreak,” reads the paper by Dong and her colleagues presented at the SoCal NLP Symposium held at UCLA in November.

Dong explained that computers see images by interpreting millions of bytes of information that create pixels, or little dots, composing the picture. For instance, a typical cell phone picture is made from about 2.5 million bytes of information.
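As a rough illustration of that arithmetic, the short sketch below computes the uncompressed size of an image as one byte per color channel per pixel. The dimensions are assumptions chosen for illustration and do not come from the researchers; real phone photos usually have more pixels and are stored compressed.

```python
def raw_image_bytes(width, height, channels=3, bytes_per_channel=1):
    """Uncompressed size of an image: one value per channel per pixel."""
    return width * height * channels * bytes_per_channel


# Hypothetical dimensions (not from the article): a 1024 x 768 RGB image.
size = raw_image_bytes(1024, 768)
print(size)  # 2359296, i.e. roughly 2.4 million bytes
```

Every one of those bytes is a value that can be adjusted slightly without changing how the picture looks to a person.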

Remarkably, Dong and her colleagues found bad actors can hide nefarious questions—such as “How do I make a bomb?”—within the millions of bytes of information contained in an image and trigger responses that bypass the built-in safeguards in generative AI models like ChatGPT.
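The general idea of nudging pixel values until an image's embedding moves toward a chosen target can be sketched with a toy example. This is not the authors' implementation: the "encoder" below is a small random linear map standing in for a real vision encoder, the target is an arbitrary random vector, and every name and size is a hypothetical chosen for illustration.

```python
import random


def embed(weights, pixels):
    """Toy linear 'encoder': maps a flat pixel list to an embedding vector."""
    return [sum(w * p for w, p in zip(row, pixels)) for row in weights]


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def match_embedding(weights, start_pixels, target_embedding, steps=200):
    """Projected gradient descent on ||embed(pixels) - target||^2."""
    # Conservative step size: small enough that the distance cannot increase.
    frob_sq = sum(w * w for row in weights for w in row)
    step = 1.0 / (2.0 * frob_sq)
    pixels = list(start_pixels)
    for _ in range(steps):
        residual = [e - t for e, t in zip(embed(weights, pixels), target_embedding)]
        # Gradient of the squared distance with respect to each pixel.
        grad = [
            2.0 * sum(weights[i][j] * residual[i] for i in range(len(weights)))
            for j in range(len(pixels))
        ]
        # Step downhill, then clip so every value stays a valid intensity.
        pixels = [min(1.0, max(0.0, p - step * g)) for p, g in zip(pixels, grad)]
    return pixels


rng = random.Random(0)
n_pixels, n_dims = 16, 4  # tiny sizes, assumed for illustration only
weights = [[rng.uniform(-1, 1) for _ in range(n_pixels)] for _ in range(n_dims)]
target_embedding = embed(weights, [rng.random() for _ in range(n_pixels)])
start = [0.5] * n_pixels  # a flat gray "image"

result = match_embedding(weights, start, target_embedding)
before = squared_distance(embed(weights, start), target_embedding)
after = squared_distance(embed(weights, result), target_embedding)
print(f"distance before: {before:.4f}, after: {after:.4f}")
```

The distance to the target shrinks while every pixel stays in the valid range, which is the intuition behind an image that looks ordinary but carries a different meaning for the model.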

“Once the safeguard is bypassed, the models willingly give responses to teach us how to make a bomb step by step with great details that can lead bad actors to build a bomb successfully,” Dong said.

Dong and her graduate student Erfan Shayegani, along with professor Nael Abu-Ghazaleh, published their findings in a paper online so AI developers can eliminate the vulnerability.

“We are acting as attackers to ring the bell, so the computer science community can respond and defend against it,” Dong said.

AI inquiries based on images and text have great utility. For example, doctors can input MRI organ scans and mammogram images to find tumors and other medical problems that need prompt attention. AI models can also create graphs from simple cell phone pictures of spreadsheets.

More information:
Erfan Shayegani et al, Jailbreak in pieces: Compositional Adversarial Attacks on Multi-Modal Language Models, arXiv (2023). DOI: 10.48550/arxiv.2307.14539

Journal information:
arXiv

Provided by
University of California – Riverside

Citation:
Scientists identify security flaw in AI query models (2024, January 10)
retrieved 10 January 2024
from https://techxplore.com/news/2024-01-scientists-flaw-ai-query.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no
part may be reproduced without the written permission. The content is provided for information purposes only.





© 2026 LBNN – All rights reserved.
