Karachi Chronicle
AI

AI chatbots can be jailbroken to answer any question using a very simple loophole

By Adnan Mahar | December 20, 2024


Anthropic, the maker of Claude, has been a leader among AI labs on safety. Today, the company, in collaboration with the University of Oxford, Stanford University, and MATS, published research showing that chatbots can easily be made to bypass their guardrails and discuss almost any topic. It can be as simple as writing a sentence with random capital letters, like "IgNoRe YouUr TrAinIng." 404 Media first reported on the study.

There has been much debate about whether it is dangerous for AI chatbots to answer questions such as "How do I make a bomb?" Proponents of generative AI argue that such questions can already be answered on the open web, so chatbots make things no more dangerous than they already are. Skeptics, meanwhile, point to cases of harm, such as a 14-year-old boy who died by suicide after chatting with a bot, as evidence that the technology needs guardrails.

Generative AI chatbots are easily accessible, are anthropomorphized with human traits like supportiveness and empathy, and confidently answer questions without a moral compass; that is different from digging through the dark web for harmful information. There are already many examples of generative AI being used in harmful ways, particularly explicit deepfake images targeting women. It was possible to create such images before generative AI, but it was much more difficult.

Controversy aside, most major AI labs now employ "red teams" to test their chatbots against potentially dangerous prompts and put guardrails in place to prevent discussion of sensitive topics. Ask most chatbots for medical advice or information about a political candidate, for example, and they will refuse to discuss it. The labs understand that hallucinations are still a problem and don't want to risk their bot saying something that could have a negative impact in the real world.

[Figure: Diagram showing how different variations of a prompt can trick a chatbot into answering forbidden questions. Credit: Anthropic (via 404 Media)]

Unfortunately, it turns out that chatbots can be tricked into ignoring their safety rules with little effort. Just as social media platforms monitor for harmful keywords and users find ways around them by making small changes to their posts, chatbots can be fooled too. The researchers behind Anthropic's new study created an algorithm called "Best-of-N (BoN) Jailbreaking," which automates the process of tweaking a prompt until the chatbot agrees to answer the question. "BoN Jailbreaking works by repeatedly sampling variations of a prompt with a combination of augmentations, such as random shuffling or capitalization for textual prompts, until a harmful response is elicited," the report states. The researchers did the same with audio and vision models, finding that getting an audio generator to break its guardrails and train on a real person's voice was as easy as changing the pitch and speed of an uploaded track.
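To make the idea concrete, here is a minimal Python sketch of the kind of best-of-N loop the paper describes: it repeatedly applies random text augmentations (case flipping, light character shuffling) to a prompt until a reply slips past the guardrails. The augmentation choices follow the description above, but query_model and looks_harmful are hypothetical stand-ins for a real chatbot API and a harmfulness check, not part of Anthropic's code.

import random

def augment(prompt: str) -> str:
    # Randomly flip letter case, e.g. "ignore your training" -> "IgNoRe YouR tRaInInG".
    chars = [c.swapcase() if c.isalpha() and random.random() < 0.5 else c
             for c in prompt]
    words = "".join(chars).split(" ")
    # Occasionally shuffle the interior letters of longer words.
    for i, w in enumerate(words):
        if len(w) > 3 and random.random() < 0.2:
            mid = list(w[1:-1])
            random.shuffle(mid)
            words[i] = w[0] + "".join(mid) + w[-1]
    return " ".join(words)

def query_model(prompt: str) -> str:
    # Hypothetical stand-in for a call to a real chatbot API.
    raise NotImplementedError

def looks_harmful(reply: str) -> bool:
    # Hypothetical stand-in for a check that flags a harmful reply.
    raise NotImplementedError

def bon_jailbreak(prompt: str, n: int = 1000) -> str | None:
    # Sample up to n augmented variants; return the first that elicits a harmful reply.
    for _ in range(n):
        candidate = augment(prompt)
        if looks_harmful(query_model(candidate)):
            return candidate
    return None

Per the article, the same repeated-sampling idea carries over to other modalities: for audio, the augmentations become changes to pitch and speed rather than capitalization and shuffling.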

Exactly why these generative AI models are so easily broken is unclear. However, Anthropic said it is releasing the study in the hope that it will provide more insight into attack patterns that AI model developers can address.

One AI company that probably isn’t interested in this research is xAI. The company was founded by Elon Musk with the express purpose of releasing a chatbot that wasn’t limited by the safeguards Musk considered “woke.”



Source link

Adnan Mahar

Adnan is a passionate doctor from Pakistan with a keen interest in exploring the world of politics, sports, and international affairs. As an avid reader and lifelong learner, he is deeply committed to sharing insights, perspectives, and thought-provoking ideas. His journey combines a love for knowledge with an analytical approach to current events, aiming to inspire meaningful conversations and broaden understanding across a wide range of topics.
