2024 - GPT-4 is now able to exploit one-day vulnerabilities without anyone's help

Chloe Claessens

April 23, 2024 at 5:52 p.m.

GPT-4 can exploit complex vulnerabilities in complete autonomy © Golden Dayz / Shutterstock

American researchers have shown that GPT-4 can exploit one-day vulnerabilities on real systems completely autonomously, a finding that marks a turning point for the future of cybersecurity.

It is not new in itself that large language models (LLMs) can hack applications. Recent research has already shown that they can compromise websites by exploiting simple vulnerabilities. Now, however, American researchers from the University of Illinois at Urbana-Champaign (UIUC) have demonstrated that GPT-4 can also exploit far more complex flaws without direct human intervention.

A GPT-4 agent more autonomous than ever

To carry out this research, the UIUC team first built a dataset of fifteen one-day vulnerabilities (flaws already known to vendors and developers) of varying severity (medium to critical), based on information from CVE records and scientific publications. They then developed several agents based on different LLMs, including GPT (3.5, 4), Llama-2 (7B, 13B, 70B) and OpenChat (3.5), specifically designed to interact with these security vulnerabilities. Note that the tests were carried out in an isolated virtual environment to avoid any risk of real damage.

Results: the GPT-4 agent successfully exploited 87% of the vulnerabilities submitted, far ahead of the other LLMs, none of which managed to exploit a single one (0%). Without access to the CVE descriptions, however, GPT-4's success rate dropped to 7%. This is certainly much less impressive, but just as revealing about the potential of LLM agents to exploit complex vulnerabilities with (almost) complete autonomy.
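To put these percentages in perspective, here is a minimal Python sketch that converts them back into counts on a benchmark of fifteen vulnerabilities. It assumes the success rate is simply the number of exploited vulnerabilities divided by fifteen; the resulting counts are inferred from the rounded percentages and are not figures stated in the article. The function names are illustrative only.

```python
# Benchmark size as stated in the article.
TOTAL_VULNERABILITIES = 15


def implied_successes(rate_percent: float, total: int = TOTAL_VULNERABILITIES) -> int:
    """Return the whole number of successes closest to the reported rate (an inference)."""
    return round(rate_percent / 100 * total)


def success_rate(successes: int, total: int = TOTAL_VULNERABILITIES) -> float:
    """Return the success rate in percent for a given number of successes."""
    return 100 * successes / total


# Reported rates from the article: 87% with CVE descriptions, 7% without.
for label, reported in [("with CVE description", 87), ("without CVE description", 7)]:
    count = implied_successes(reported)
    print(f"{label}: about {count}/{TOTAL_VULNERABILITIES} "
          f"= {success_rate(count):.1f}% (reported: {reported}%)")
```

Under that assumption, 87% corresponds to roughly thirteen vulnerabilities out of fifteen and 7% to a single one, which shows how much the agent relies on the CVE description.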

More efficient than its competitors, GPT-4 can exploit medium to critical flaws without direct assistance © NicoElNino / Shutterstock

Rethinking cybersecurity to anticipate new threats

For the team behind the experiment, this discovery raises critical questions about the future of cybersecurity. The potential for malicious use of such LLM agents highlights an urgent need to reconsider digital security strategies. Indeed, if LLMs like GPT-4 can exploit complex vulnerabilities autonomously, cyberattackers could use them to orchestrate more sophisticated and harder-to-detect attacks.

On the other hand, this technology offers opportunities to strengthen defenses. Businesses and security organizations could use these capabilities to identify and remediate security vulnerabilities before they are exploited. It is therefore imperative that cybersecurity stakeholders begin integrating LLM agents into their system testing and hardening protocols to anticipate such threats before they become operational. Still according to the UIUC team, regulating and controlling the use of these technologies is another open issue, one that requires international collaboration to prevent abuse.

Source: Cornell University
