Using Machine Learning to Transform Data into Cyber Threat Intelligence


Whether we realize it or not, our digital lives and what we see on the internet are controlled and determined by algorithms and analytics. Through them, businesses learn what our preferences are and what we’re drawn to in order to target us with information. The idea is to present us with information that is most relevant to us.

In the same way, cybersecurity professionals are constantly faced with an enormous amount of threat data to sift through and prioritize on a daily basis. In fact, “too much data to analyze” is the number one obstacle inhibiting companies from defending against cyber threats according to the 2019 Cyberthreat Defense Report by CyberEdge.

Yet, while the overwhelming majority (77%) of companies recognize that threat intelligence is important or very important to their overall security posture, most organizations can only comfortably research and utilize between 1 and 100 threat indicators weekly. That pace is not nearly enough to keep up with the evolving threat landscape, nor does it come close to handling the terabytes of potential threat data, sourced from multiple providers, that rigorous, resilient threat analysis requires.

To compound matters, the roll-out of 5G will increase the usage of devices, resulting in even more unprocessed data and, therefore, a bigger risk of cyber threats.

It is no wonder, then, that making sense of raw threat data today requires advanced analytics and technology, in addition to human intelligence, to evaluate and interpret it efficiently and accurately at this volume. If done correctly, combining threat analysis and data analysis with machine learning can help security teams quickly turn raw data into effective operational cyber threat intelligence.

Cyber threat intelligence can incorporate any number of key elements, such as the ongoing collection, normalization, research and analysis of threat data, so that an individual or organization can construct a calculated and effective response to a cyber-attack.

The need for cohesive cyber threat intelligence is fueled by steady growth in the number and variety of threats, including an increase in malware families and threat groups. This growth has coincided with the emergence of commercialized cybercrime.

Therefore, there is a natural progression to categorizing and predicting threats, where analysis – aided by machine learning – can help enterprises become more resilient in defending their networks.

Categorizing and predicting threats

There are various ways in which threat researchers categorize or predict threats. For example, a “malware-centric approach” examines malware to create groupings or families that display similar traits, using the analysis methods described above.
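As a rough illustration of the malware-centric idea, the sketch below groups samples whose observed traits overlap. It is not a method described in this article: the Jaccard similarity measure, the greedy grouping, the 0.5 threshold, and the sample and trait names are all assumptions chosen for brevity.

```python
def jaccard(a, b):
    """Share of traits two samples have in common (0.0 to 1.0)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def group_by_traits(samples, threshold=0.5):
    """Greedy grouping: a sample joins the first family whose first
    member is similar enough, otherwise it starts a new family."""
    families = []
    for name, traits in samples.items():
        for family in families:
            if jaccard(traits, samples[family[0]]) >= threshold:
                family.append(name)
                break
        else:
            families.append([name])
    return families


# Hypothetical samples and trait labels, for illustration only.
SAMPLES = {
    "sample_a": {"packs_with_upx", "registry_run_key", "http_beacon"},
    "sample_b": {"packs_with_upx", "registry_run_key", "dns_beacon"},
    "sample_c": {"encrypts_files", "drops_ransom_note"},
}

print(group_by_traits(SAMPLES))
```

In practice a clustering or classification model trained on far richer features would take the place of this hand-set threshold, but the principle of grouping by shared traits is the same.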

Another approach is more “adversary-centric” and involves observing a criminal’s behavior, motivations, location and typical infrastructure while trying to piece the puzzle together with visible patterns. To help with the adversary identification process, threat researchers can use atomic threat indicators (such as IP addresses, file hashes, and domains) as detection tools and a first line of defense. However, adversaries can easily change these “lower-level” threat indicators. Therefore, researchers layer this information with a behavior-based approach that categorizes the patterns and behaviors of malware and adversaries.
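The layering described here can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: the blocklist entries, behavior labels, and the `Event` structure are hypothetical, and a real deployment would draw indicators from threat feeds and behaviors from endpoint or network telemetry.

```python
from dataclasses import dataclass, field

# Hypothetical atomic indicators (documentation-range IP, placeholder hash).
KNOWN_BAD_IPS = {"203.0.113.7"}
KNOWN_BAD_HASHES = {"a" * 64}
KNOWN_BAD_DOMAINS = {"malicious.example"}

# Hypothetical behavior labels an analyst has tied to an adversary.
SUSPICIOUS_BEHAVIORS = {"credential_dumping", "disables_security_tools"}


@dataclass
class Event:
    src_ip: str = ""
    file_hash: str = ""
    domain: str = ""
    behaviors: set = field(default_factory=set)


def atomic_hits(event):
    """First layer: exact matches against known-bad indicators."""
    hits = []
    if event.src_ip in KNOWN_BAD_IPS:
        hits.append("ip")
    if event.file_hash in KNOWN_BAD_HASHES:
        hits.append("hash")
    if event.domain in KNOWN_BAD_DOMAINS:
        hits.append("domain")
    return hits


def behavioral_hits(event):
    """Second layer: behaviors that are harder for an adversary to change."""
    return sorted(event.behaviors & SUSPICIOUS_BEHAVIORS)


def assess(event):
    atomic = atomic_hits(event)
    behavioral = behavioral_hits(event)
    return {"atomic": atomic, "behavioral": behavioral,
            "flagged": bool(atomic or behavioral)}


# The adversary has rotated to a fresh IP, but the behavior still gives it away.
rotated = Event(src_ip="198.51.100.9", behaviors={"credential_dumping"})
print(assess(rotated))
```

The point of the second layer is visible in the last two lines: an event with no atomic matches is still flagged because of how the actor behaves.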

By focusing on the behaviors of a threat actor, security researchers give themselves a better chance to identify and attribute the source of the threat, because it is more difficult and expensive for threat actors to change their overall infrastructure and identifiable traits. This is the main reason the MITRE ATT&CK threat intelligence framework is commonly used and integrated into security technologies, as it is a preferred source of knowledge on adversary tactics and techniques.

Because the threat landscape and IT systems are constantly evolving (exacerbated by the explosion of devices connecting to networks), resilience is key to protecting a business from attack. To achieve resilience in threat detection and response, security professionals should use a layered approach to threat intelligence, one that includes atomic threat indicators as a first line of defense and makes use of both a malware-centric and an adversary-centric approach.

A layered approach can provide insight into an attack not just at one stage, but along multiple stages of the Kill Chain, from reconnaissance to command and control (C&C). If an adversary modifies their tactics at any of those stages, such as the way they deliver a payload, defenders can still detect the attack at another stage, for example in how the attacker installs malware on an asset.
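To make the resilience argument concrete, the sketch below checks a campaign against detections at several stages. The stage names follow the Lockheed Martin Cyber Kill Chain, which is assumed to be the model meant here; the detector signatures and campaign traits are hypothetical.

```python
# Stages of the Lockheed Martin Cyber Kill Chain, in order.
KILL_CHAIN = [
    "reconnaissance",
    "weaponization",
    "delivery",
    "exploitation",
    "installation",
    "command_and_control",
    "actions_on_objectives",
]

# Hypothetical detections a defender has in place, keyed by stage.
DETECTORS = {
    "delivery": {"macro_enabled_attachment"},
    "installation": {"registry_run_key"},
    "command_and_control": {"beacon_to_known_domain"},
}


def detected_stages(detectors, campaign):
    """Return the stages where a detector matches an observed trait."""
    return [
        stage
        for stage in KILL_CHAIN
        if detectors.get(stage, set()) & campaign.get(stage, set())
    ]


# Hypothetical campaign, then the same campaign with a new delivery method.
original = {
    "delivery": {"macro_enabled_attachment"},
    "installation": {"registry_run_key"},
    "command_and_control": {"beacon_to_known_domain"},
}
modified = dict(original, delivery={"malicious_link"})

print(detected_stages(DETECTORS, original))
print(detected_stages(DETECTORS, modified))
```

Changing the delivery method removes one detection, but the campaign is still caught at installation and at command and control.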

Quality cyber threat intelligence is one of the most valuable resources defenders have to help protect an organization against threats. By curating multiple layers of threat intelligence, researchers can create comprehensive adversary profiles, which are more effective at predicting new and evolving threats.

Moreover, by using machine learning and data analytics, they are better equipped to deal with the enormous volume of global threat data that must be analyzed to gain this level of insight.
