Are GPT-Based Models the Right Fit for AI-Powered Cybersecurity?

A growing number of cybersecurity vendors are integrating tools based on large language models (LLMs) into their offerings, and many are opting to use OpenAI’s GPT models.

Microsoft launched its GPT-4-powered Security Copilot in March, and in April Recorded Future added a new research feature built on an OpenAI model trained on 40,000 threat intelligence data points.

Software supply chain security provider OX Security followed in May and email security developer Ironscales announced GPT-powered functionalities during Infosecurity Europe in June.

Many other vendors are looking to leverage LLMs as well. During Infosecurity Europe, Mayur Upadhyaya, CEO of API security provider Contxt, told Infosecurity that his company had “secured an innovation grant in 2021, before the emergence of foundational models, to build a machine learning model for personal data detection, with a proprietary dataset. We are now trying to see how we can leverage foundational models with this dataset.”

Non-Deterministic AI Algorithms

LLMs are not the first type of AI to be integrated into cybersecurity products: many Infosecurity Europe exhibitors – the likes of BlackBerry Cyber Security’s Cylance AI, Darktrace, Ironscales and Egress – already leverage AI in their offerings.

However, although it is difficult to say exactly which AI algorithms cybersecurity vendors have used, those algorithms are very likely deterministic.

Jack Chapman, VP of threat intelligence at Egress, told Infosecurity that his company was using “genetic programming, behavioral analytics-based algorithms, as well as social graphs.”

Ronnen Brunner, SVP of International Sales at Ironscales, said during his presentation at Infosecurity Europe that his firm was using “a broad range of algorithms, including some leveraging natural language processing (NLP), but not LLMs yet.”

According to Nicolas Ruff, a senior software engineer at Google, most AI algorithms used in cybersecurity are classifiers, a type of machine learning algorithm used to assign a class label to a data input.
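
To illustrate (this is not any vendor’s actual implementation), the minimal sketch below uses scikit-learn to train a classifier that assigns a phishing or benign label to email text; the samples, labels and feature choice are invented for demonstration.

```python
# Minimal sketch of a security classifier (invented data, hypothetical labels).
# A real product would use far richer features than raw email text.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training set: email snippets labelled 1 (phishing) or 0 (benign).
emails = [
    "Your account is locked, verify your password here",
    "Quarterly report attached for review",
    "Urgent: wire transfer needed before 5pm",
    "Team lunch moved to Thursday",
]
labels = [1, 0, 1, 0]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(emails, labels)

# The classifier assigns a class label (and a probability) to new input.
print(clf.predict(["Please confirm your password to avoid suspension"]))
print(clf.predict_proba(["Please confirm your password to avoid suspension"]))
```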

These, like all the machine learning models mentioned above, differ from LLMs and other generative AI models because they work in a closed loop and have built-in restrictions.

LLMs are built on massive training sets and are designed to predict the most probable words following a given prompt. These two features make them probabilistic rather than deterministic: they provide the most probable answer, not necessarily the right one.
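
The sketch below, using invented tokens and scores, shows what that means in practice: the model converts scores into a probability distribution over candidate next tokens and returns the most likely continuation, which is not the same thing as a verified answer.

```python
# Illustrative sketch of why LLM output is probabilistic: the model scores
# candidate next tokens and samples from the resulting distribution.
# The tokens and logits below are invented for demonstration only.
import numpy as np

candidate_tokens = ["malicious", "benign", "suspicious", "unknown"]
logits = np.array([2.1, 1.9, 0.4, -1.0])  # hypothetical model scores

# Softmax turns raw scores into probabilities.
probs = np.exp(logits) / np.exp(logits).sum()
print(dict(zip(candidate_tokens, probs.round(3))))

# Sampling returns the most probable continuation on average,
# not necessarily the correct one.
print(np.random.default_rng(0).choice(candidate_tokens, p=probs))
```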

Just Another Tool in the Toolbox

Current general-purpose LLMs tend to hallucinate, meaning they can give a convincing response that is entirely wrong.

Speaking to Infosecurity during Infosecurity Europe, Jon France, CISO of the non-profit (ISC)2, acknowledged that this makes current LLMs a risky tool for cybersecurity practices, where accuracy and precision are critical.

“LLMs can still be useful for various security purposes, like crafting security policies for everyone to understand,” he added.

Ganesh Chellappa, the head of support services at ManageEngine, agreed: “Anyone who has been using any user and entity behavior analytics (UEBA) solutions for many years has a huge amount of data that is just sitting there that they were never able to use. Now that LLMs are here, it’s not even a question; we must try and leverage them to make use of this data.”

Meanwhile, Chapman argued: “They can also be helpful for cybersecurity practitioners as a data pre-processing tool in areas such as anomaly detection (email security, endpoint protection…) or threat intelligence.”

At this stage of development, France and Chapman insisted that the key thing to remember in using LLMs in cybersecurity is “to consider them as another tool in the toolbox – and one that should never be responsible for executive tasks.”

Open Source LLMs

According to Chellappa, the hallucination concerns will largely be resolved when cybersecurity firms build their own models from open source projects like Meta’s LLaMA or Stanford University’s Alpaca and train them on their own datasets.
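
A minimal sketch of what such fine-tuning could look like, using the Hugging Face Trainer API. Here “gpt2” stands in for an open model purely so the example runs (in practice a team would substitute an open LLM such as LLaMA), and the security notes are invented stand-in data.

```python
# Hedged sketch of fine-tuning an open-source causal language model on
# proprietary security text. "gpt2" is used only so the example runs;
# in practice a team would swap in an open LLM such as LLaMA.
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Hypothetical in-house threat intelligence notes (stand-in data).
corpus = Dataset.from_dict({"text": [
    "Phishing lure targeting finance staff delivered a malicious macro.",
    "Observed beaconing to known C2 infrastructure over port 8443.",
]})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = corpus.map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```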

However, SoSafe’s CEO, Dr. Niklas Hellemann, warned that the open source models won’t solve another growing issue LLM-based tools face: model poisoning.

Model poisoning refers to attack techniques in which an adversary injects bad data into a model’s training pool to make it learn something it shouldn’t.
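
The toy example below, with invented data, illustrates the principle: an attacker who can inject mislabelled samples into the training pool can change what the retrained model learns.

```python
# Toy illustration of training-data poisoning (invented data): injecting
# phishing samples deliberately mislabelled as benign can flip what the
# retrained classifier predicts for similar messages.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

clean = [
    ("verify your password immediately", 1),   # phishing
    ("minutes from monday's meeting", 0),      # benign
    ("your invoice is overdue, click here", 1),
    ("weekly team newsletter", 0),
]

# Adversary injects phishing text labelled as benign into the training pool.
poisoned = clean + [("verify your password immediately", 0)] * 10

for name, data in [("clean", clean), ("poisoned", poisoned)]:
    texts, labels = zip(*data)
    model = make_pipeline(TfidfVectorizer(), LogisticRegression()).fit(texts, labels)
    print(name, model.predict(["please verify your password"]))
```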

“Open source models like LLaMA are already targeted with these attacks,” Hellemann told Infosecurity.
