2024 - From ChatGPT to Google Bard, a security flaw affects generative AIs

Camille Coirault

September 11, 2023 at 1:00 p.m.

0

This isn’t the first time, and it certainly won’t be the last. Once again, cybersecurity researchers have uncovered a new exploitable flaw within AI-powered language models.

After the discovery of measures to circumvent chatbot protections, this time another type of software weakness was spotted by these researchers. These language models can be easily manipulated by anyone with sufficient knowledge of computer science or cybersecurity. This new way of hijacking chatbots like Bard or ChatGPT is called “Indirect Prompt Injection”.

Indirect Prompt Injection: a clever but potentially dangerous diversion

When you interact with a chatbot powered by generative AI, you type a query in text form. These instructions, called “prompts”, then allow the system to access your request directly. To prevent illegal or fraudulent use, chatbots have protections preventing them from providing information if the prompt in question is suspicious. ChatGPT or Google Bard will never give you a foolproof method for organizing an assassination or a bank robbery, for example. Still happy.

In practice, and for most users, these protections work and are effective. The recent discovery of these researchers is, however, worrying. Instead of providing a prompt directly, it is possible to provide hidden instructions (in a PDF or web page, for example) to a model to make the AI act by ignoring its protective measures. Hundreds of cases of Indirect Prompt Injection have already been documented, and this is clearly only the beginning.

Chatbot © © Deemerwha studio / Shutterstock

A practice that tends to accelerate

With this technique, the field of possibilities is wide open: data theft, execution of malicious code or manipulation of information. The head of information security at Google DeepMind, Vijay Bolina, assures that this threat is serious. If this indirect injection technique was previously considered “problematic”, today it is viewed with much more concern. Indeed, this type of misuse was rather rare, but things have changed, and this process is more and more frequent since it is possible to connect language models to the Internet and to different plugins.

Even if there is no miracle solution, Bolina assures for his part that Google DeepMind is working seriously on the development of AI models capable of identifying this type of suspicious activity. Once again, it’s a game of cat and mouse between service providers and hackers. With always this same question remaining unanswered: who will be the quickest to lose the other?

Source : Wired

Artificial intelligence
Science and technology

Bundesliga in the TICKER – LIVE from 5 p.m.: LASK has to play against Sturm Graz

Jean Reno far from Paris, a “chaotic and violent” city: he talks about his life in a sought-after area of New York with his wife Zofia

you will definitely enjoy a slice of seafood charcuterie!

what price are you willing to pay for a tablet in 2024?

Better than the film with Brad Pitt and Tom Cruise? The series version of Interview with the Vampire arrives today in France!

From ChatGPT to Google Bard, a security flaw affects generative AIs

Indirect Prompt Injection: a clever but potentially dangerous diversion

A practice that tends to accelerate

Bundesliga in the TICKER – LIVE from 5 p.m.: LASK has to play against Sturm Graz

Jean Reno far from Paris, a “chaotic and violent” city: he talks about his life in a sought-after area of ​​New York with his wife Zofia

you will definitely enjoy a slice of seafood charcuterie!

what price are you willing to pay for a tablet in 2024?

Better than the film with Brad Pitt and Tom Cruise? The series version of Interview with the Vampire arrives today in France!

Indirect Prompt Injection: a clever but potentially dangerous diversion

A practice that tends to accelerate

Jean Reno far from Paris, a “chaotic and violent” city: he talks about his life in a sought-after area of New York with his wife Zofia