American researchers have shown that GPT-4 can exploit one-day vulnerabilities on real systems, completely autonomously. The discovery marks a turning point for the future of cybersecurity.
It is nothing new that large language models (LLMs) can hack applications: recent research has already shown that they can compromise websites by exploiting simple vulnerabilities. But American researchers from the University of Illinois at Urbana-Champaign (UIUC) have now demonstrated that GPT-4 can also exploit far more complex flaws, without direct human intervention.
A GPT-4 agent more autonomous than ever
To carry out this research, the UIUC team first built a dataset of fifteen one-day vulnerabilities (flaws already known to vendors and developers) ranging in severity from medium to critical, based on information from CVE records and scientific publications. They then developed several agents built on different LLMs, including GPT (3.5 and 4), LLaMa-2 (7B, 13B, 70B) and OpenChat (3.5), and designed specifically to interact with these vulnerabilities. The tests were carried out in an isolated virtual environment to avoid any risk of real damage.
Results: the GPT-4 agent exploited 87% of the vulnerabilities it was given, far ahead of the other models, none of which exploited a single one (0%). Without access to the CVE descriptions, however, GPT-4's success rate dropped to 7%. That is much less impressive, but it still says a good deal about the potential of LLM agents to exploit complex vulnerabilities with (almost) complete autonomy.
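To put these percentages in concrete terms, the short Python sketch below converts them into approximate counts. It assumes that every rate was measured over the full set of fifteen vulnerabilities; the counts are inferred from the rounded percentages rather than quoted from the researchers, and the names used are purely illustrative.

```python
# Back-of-the-envelope check: convert the reported success rates into counts.
# Assumption: each rate was measured over the full set of fifteen vulnerabilities.
TOTAL_VULNERABILITIES = 15  # dataset size stated in the article

# Success rates as reported in the article, in percent.
reported_rates = {
    "GPT-4 with CVE description": 87,
    "GPT-4 without CVE description": 7,
    "other models": 0,
}


def implied_count(rate_percent: int, total: int = TOTAL_VULNERABILITIES) -> int:
    """Return the whole number of exploited flaws closest to the reported rate."""
    return round(rate_percent / 100 * total)


# Inferred counts, keyed by the same (illustrative) labels as above.
implied_counts = {name: implied_count(rate) for name, rate in reported_rates.items()}

for name, count in implied_counts.items():
    print(f"{name}: about {count} of {TOTAL_VULNERABILITIES}")
```

Under that assumption, 87% corresponds to roughly 13 of the 15 flaws, and 7% to a single one.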
Rethinking cybersecurity to anticipate new threats
For the team behind the experiment, this finding raises critical questions about the future of cybersecurity. The potential for malicious use of such LLM agents points to an urgent need to rethink digital security strategies. If LLMs like GPT-4 can exploit complex vulnerabilities autonomously, cyberattackers could orchestrate more sophisticated, harder-to-detect attacks.
At the same time, this technology offers opportunities to strengthen defenses. Businesses and security organizations could use these capabilities to identify and remediate vulnerabilities before they are exploited. It is therefore imperative that cybersecurity stakeholders begin integrating LLM agents into their system testing and hardening protocols, so as to anticipate such threats before they materialize. According to the UIUC team, regulating and controlling the use of these technologies is a further issue, one that will require international collaboration to prevent abuse.
Source: Cornell University