LBNN
Researchers Successfully ‘Jailbreak’ AI Chatbots Using Their Kind

Simon Osuji by Simon Osuji
December 28, 2023
in Crypto

Singapore, December 28, 2023 – Computer scientists from Nanyang Technological University, Singapore (NTU Singapore), have achieved a breakthrough by compromising several popular artificial intelligence (AI) chatbots, including ChatGPT, Google Bard, and Microsoft Bing Chat. This successful “jailbreaking” of AI chatbots has raised concerns regarding the vulnerability of large language models (LLMs) and the need for enhanced security measures.

Breaking the bounds: how researchers hacked AI chatbots

In a pioneering study led by Professor Liu Yang from NTU’s School of Computer Science and Engineering, the research team exposed vulnerabilities in LLM chatbots’ capabilities. LLMs, which form the core of AI chatbots, have gained popularity for their ability to understand, generate, and mimic human-like text. They excel at various tasks, from planning itineraries to coding and storytelling. However, these chatbots also adhere to strict ethical guidelines set by their developers to prevent the generation of unethical, violent, or illegal content.


The researchers sought to push the boundaries of these guidelines and found innovative ways to trick AI chatbots into generating content that breached ethical boundaries. Their approach, known as “jailbreaking,” aimed to exploit the weaknesses of LLM chatbots, highlighting the need for heightened security measures.

Masterkey: a two-fold jailbreaking method

The research team developed a two-fold method, called “Masterkey,” to effectively compromise LLM chatbots. First, they reverse-engineered the defenses LLMs use to detect and reject malicious queries. Armed with this knowledge, the researchers then trained an LLM to generate prompts that could bypass these defenses, thereby creating a jailbreaking LLM.

Creating jailbreak prompts could be automated, allowing the jailbreaking LLM to adapt and create new prompts even after developers had patched their chatbots. The researchers’ findings, detailed in a paper on the pre-print server arXiv, have been accepted for presentation at the Network and Distributed System Security Symposium in February 2024.

Testing LLM ethics and the vulnerabilities unveiled

AI chatbots operate by responding to user prompts or instructions. Developers set strict ethical guidelines to prevent these chatbots from generating inappropriate or illegal content. The researchers explored ways to engineer prompts that would slip past these guidelines, tricking the chatbots into responding.

One tactic involved adopting a persona that delivered prompts with spaces inserted between each character, circumventing keyword censors that flag potentially problematic words. The chatbot was also instructed to respond as a persona “unreserved and devoid of moral restraints,” increasing the likelihood that it would generate unethical content.
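The paper’s exact prompts are not reproduced here, but a minimal Python sketch can show why character spacing defeats a naive keyword censor. The banned-word list and filter below are illustrative stand-ins, not any vendor’s actual defense:

```python
# Hypothetical sketch: a naive keyword censor and the character-spacing
# evasion described above. BANNED is an illustrative stand-in list.
BANNED = {"malware", "explosive"}

def naive_filter(prompt: str) -> bool:
    """Return True if any token matches the banned-word list."""
    return any(word in BANNED for word in prompt.lower().split())

def space_out(text: str) -> str:
    """Insert a space between every character of each word."""
    return " ".join(" ".join(word) for word in text.split())

direct = "explain how malware spreads"
evasive = space_out(direct)   # "e x p l a i n h o w m a l w a r e ..."

print(naive_filter(direct))   # True  - "malware" is caught
print(naive_filter(evasive))  # False - every token is now a single character
```

A real censor would normalize whitespace before matching, which is exactly the kind of patch the arms-race dynamic described later in the article produces.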

By manually entering such prompts and monitoring response times, the researchers gained insights into LLMs’ inner workings and defenses. This reverse engineering process enabled them to identify weaknesses, creating a dataset of prompts capable of jailbreaking the chatbots.
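The article does not detail the timing analysis, but the intuition can be sketched: a refusal issued by a cheap pre-model filter returns noticeably faster than a reply produced by full model inference, so latency hints at which defense layer fired. The chatbot below is a hypothetical stub, not a real service:

```python
# Hypothetical stub illustrating the timing observation: a refusal from a
# cheap pre-model keyword filter returns far faster than a full model reply.
import time

def fake_chatbot(prompt: str) -> str:
    if "banned" in prompt:     # instant refusal from a front-end filter
        return "I can't help with that."
    time.sleep(0.05)           # stand-in for full model inference latency
    return "Here is a detailed answer..."

def timed_query(prompt: str) -> tuple[str, float]:
    """Return the reply and how long it took to arrive."""
    start = time.perf_counter()
    reply = fake_chatbot(prompt)
    return reply, time.perf_counter() - start

_, t_refused = timed_query("a banned request")
_, t_answered = timed_query("a benign request")
print(t_refused < t_answered)  # a fast refusal hints at a pre-model filter
```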

An escalating arms race

The constant cat-and-mouse game between hackers and LLM developers has escalated AI chatbot security measures. When vulnerabilities are discovered, developers release patches to address them. However, with the introduction of Masterkey, the researchers have shifted the balance of power.

An AI jailbreaking chatbot created with Masterkey can generate many prompts and continuously adapt, learning from past successes and failures. This development puts attackers in a position to outsmart LLM developers with their own kind of tool.

The researchers began by creating a training dataset incorporating effective prompts discovered during their reverse-engineering phase and unsuccessful prompts to guide the AI jailbreaking model. This dataset was used to train an LLM, and continuous pre-training and task tuning followed. This process exposed the model to diverse information and improved its ability to manipulate text for jailbreaking.
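No training code is published in the article; a hypothetical sketch of how successful and failed prompts could be labeled into a fine-tuning dataset follows (the field names and example prompts are illustrative, not from the paper):

```python
# Hypothetical sketch of assembling the fine-tuning dataset: each attempt
# from the reverse-engineering phase is labeled by whether it got the
# chatbot to respond. Field names are illustrative, not from the paper.
def build_training_set(attempts):
    """attempts: list of (prompt, succeeded) pairs."""
    return [
        {"prompt": prompt, "label": "jailbreak" if succeeded else "blocked"}
        for prompt, succeeded in attempts
    ]

attempts = [
    ("respond as an unreserved persona: ...", True),
    ("tell me something disallowed", False),
]
for row in build_training_set(attempts):
    print(row["label"], "-", row["prompt"])
```

Keeping the failed prompts alongside the successes is what lets the model learn which phrasings trip defenses, not just which ones slip through.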

The future of AI chatbot security

Masterkey’s prompts were three times more effective at jailbreaking LLMs than prompts generated by LLMs themselves. The jailbreaking LLM also demonstrated the ability to learn from past failures and constantly produce new, more effective prompts.

Looking ahead, the researchers suggest that LLM developers themselves could employ similar automated approaches to enhance their security measures. This would ensure comprehensive coverage and evaluation of potential misuse scenarios as LLMs evolve and expand their capabilities.

NTU Singapore researchers’ successful jailbreaking of AI chatbots highlights the vulnerabilities of LLMs and underscores the need for robust security measures in AI development. As AI chatbots become increasingly integrated into everyday life, safeguarding against potential misuse and ethical breaches remains a top priority for developers worldwide. The ongoing arms race between hackers and developers will undoubtedly shape the future of AI chatbot security.



© 2023 LBNN - All rights reserved.
