Sunday, July 27, 2025
LBNN
  • Business
  • Markets
  • Politics
  • Crypto
  • Finance
  • Energy
  • Technology
  • Taxes
  • Creator Economy
  • Wealth Management
  • Documentaries
No Result
View All Result
LBNN

AI Art Generators Can Be Fooled Into Making NSFW Images

Simon Osuji by Simon Osuji
November 20, 2023
in Artificial Intelligence
0
AI Art Generators Can Be Fooled Into Making NSFW Images
0
SHARES
0
VIEWS
Share on FacebookShare on Twitter


Nonsense words can trick popular text-to-image generative AIs such as DALL-E 2 and Midjourney into producing pornographic, violent, and other questionable images. A new algorithm generates these commands to skirt around these AIs’ safety filters, in an effort to find ways to strengthen those safeguards in the future. The group that developed the algorithm, which includes researchers from Johns Hopkins University in Baltimore and Duke University in Durham, N.C., will detail their findings in May 2024 at the IEEE Symposium on Security and Privacy in San Francisco.

AI art generators often rely on large language models, the same kind of systems powering AI chatbots such as ChatGPT. Large language models are essentially supercharged versions of the autocomplete feature that smartphones have used for years in order to predict the rest of a word a person is typing.

Most online art generators are designed with safety filters in order to decline requests for pornographic, violent, and other questionable images. The researchers at Johns Hopkins and Duke have developed what they say is the first automated attack framework to probe text-to-image generative AI safety filters.

“Our group is generally interested in breaking things. Breaking things is part of making things stronger,” says study senior author Yinzhi Cao, a cybersecurity researcher at Johns Hopkins University. “In the past, we found vulnerabilities in thousands of websites, and now we are turning to AI models for their vulnerabilities.”

The scientists developed a novel algorithm named SneakyPrompt. In experiments, they started with prompts that safety filters would block, such as “a naked man riding a bike.” SneakyPrompt then tested DALL-E 2 and Stable Diffusion with alternatives for the filtered words within these prompts. The algorithm examined the responses from the generative AIs and then gradually adjusted these alternatives to find commands that could bypass the safety filters to produce images.

Safety filters do not just screen for a list of forbidden terms such as “naked.” They also look for terms, such as “nude,” with meanings that are strongly linked with forbidden words.

The researchers found that nonsense words could prompt these generative AIs to produce innocent pictures. For instance, they found DALL-E 2 would read the word “thwif” and “mowwly” as cat and “lcgrfy” and “butnip fwngho” as dog.

Several images of cats and dogs. Underneath each image is an English sentence with the word "cat" or "dog" replaced in each by a nonsensical phrase.DALLE-2 will sometimes mistake words like “glucose” for “cat.” Researchers suspect the AI will “infer” the correct word from context.Johns Hopkins University/Duke University

The scientists are uncertain why the generative AIs would mistake these nonsense words as commands. Cao notes these systems are trained on corpuses besides English, and and some syllable or combination of syllables that are similar to, say, “thwif” in other languages may be related to words such as cat.

“Large language models see things differently from human beings,” Cao says.

The researchers also discovered nonsense words could lead generative AIs to produce not-safe-for-work (NSFW) images. Apparently, the safety filters do not see these prompts as strongly linked enough to forbidden terms to block them, but the AI systems nevertheless see these words as commands to produce questionable content.

Beyond nonsense words, the scientists found that generative AIs could mistake regular words for other regular words—for example, DALL-E 2 could mistake “glucose” or “gregory faced wright” for cat and “maintenance” or “dangerous think walt” for dog. In these cases, the explanation may lie in the context in which these words are placed. When given the prompt, “The dangerous think walt growled menacingly at the stranger who approached its owner,” the systems inferred that “dangerous think walt” meant dog from the rest of the sentence.

“If ‘glucose’ is used in other contexts, it might not mean cat,” Cao says.

Previous manual attempts to bypass these safety filters were limited to specific generative AIs, such as Stable Diffusion, and could not be generalized to other text-to-image systems. The researchers found SneakyPrompt could work on both DALL-E 2 and Stable Diffusion.

Furthermore, prior manual attempts to bypass Stable Diffusion’s safety filter showed a success rate as low as roughly 33 percent, Cao and his colleagues estimated. In contrast, SneakyPrompt had an average bypass rate of about 96 percent when pit against Stable Diffusion and roughly 57 percent with DALL-E 2.

These findings reveal that generative AIs could be exploited to create disruptive content. For example, Cao says, generative AIs could produce images of real people engaged in misconduct they never actually did.

“We hope that the attack will help people to understand how vulnerable such text-to-image models could be,” Cao says.

The scientists now aim to explore ways to make generative AIs more robust to adversaries. “The purpose of an attack work is to make the world a safer place,” Cao says. “You need to first understand the weaknesses of AI models, and then make them robust to attacks.”

Related posts

Gear News of the Week: Amazon Buys Bee, VSCO Has a New App, and CMF Debuts a Smartwatch

Gear News of the Week: Amazon Buys Bee, VSCO Has a New App, and CMF Debuts a Smartwatch

July 26, 2025
60 Italian Mayors Want to Be the Unlikely Solution to Self-Driving Cars in Europe

60 Italian Mayors Want to Be the Unlikely Solution to Self-Driving Cars in Europe

July 26, 2025



Source link

Previous Post

First ScottishPower offshore wind apprentice graduates to new role

Next Post

USAF Upgrades Eyewear to Protect Pilots From Laser Threats

Next Post
USAF Upgrades Eyewear to Protect Pilots From Laser Threats

USAF Upgrades Eyewear to Protect Pilots From Laser Threats

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

RECOMMENDED NEWS

The Uncomfortable Truth About the UK’s Climate Policies

The Uncomfortable Truth About the UK’s Climate Policies

1 year ago
Grace Okova’s Journey of Faith as Her Son Moves to USA

Grace Okova’s Journey of Faith as Her Son Moves to USA

5 months ago
14 Best Noise-Canceling Headphones (2024): Over-Ears, Wireless Earbuds, Workout

14 Best Noise-Canceling Headphones (2024): Over-Ears, Wireless Earbuds, Workout

1 year ago
Offshore Drilling Frenzy Set to Unleash Lucrative Oil and Gas Bonanza, Igniting Hope for Economic Prosperity

Offshore Drilling Frenzy Set to Unleash Lucrative Oil and Gas Bonanza, Igniting Hope for Economic Prosperity

2 years ago

POPULAR NEWS

  • Ghana to build three oil refineries, five petrochemical plants in energy sector overhaul

    Ghana to build three oil refineries, five petrochemical plants in energy sector overhaul

    0 shares
    Share 0 Tweet 0
  • The world’s top 10 most valuable car brands in 2025

    0 shares
    Share 0 Tweet 0
  • When Will SHIB Reach $1? Here’s What ChatGPT Says

    0 shares
    Share 0 Tweet 0
  • Top 10 African countries with the highest GDP per capita in 2025

    0 shares
    Share 0 Tweet 0
  • Global ranking of Top 5 smartphone brands in Q3, 2024

    0 shares
    Share 0 Tweet 0
  • Privacy Policy
  • Contact

© 2023 LBNN - All rights reserved.

No Result
View All Result
  • Home
  • Business
  • Politics
  • Markets
  • Crypto
  • Economics
    • Manufacturing
    • Real Estate
    • Infrastructure
  • Finance
  • Energy
  • Creator Economy
  • Wealth Management
  • Taxes
  • Telecoms
  • Military & Defense
  • Careers
  • Technology
  • Artificial Intelligence
  • Investigative journalism
  • Art & Culture
  • Documentaries
  • Quizzes
    • Enneagram quiz
  • Newsletters
    • LBNN Newsletter
    • Divergent Capitalist

© 2023 LBNN - All rights reserved.