Distributed Correlation and the Future of SIEM

Increasingly interconnected IT systems are generating an ever-greater volume of data. Indeed, it is an oft-repeated claim that 90% of all the world’s data has been generated within the last two years, an exponential increase that shows no sign of slowing.

All of this data and metadata does, however, present two particular challenges: how and where to store it, and how to distribute and transmit it efficiently between the systems that need to use it.

While these challenges are real for every area of an organization’s IT infrastructure, they are of particular concern for its IT operations. 

Fortunately, an evolution in innovative SecOps solutions has helped to address these pain points, enabling organizations to scale the rate at which they ingest event data and allowing it to be consumed by third-party big data and analytics solutions. But this, in turn, has presented a new challenge: gleaning intelligence from the huge volume of available data without drowning in the ‘data lake’.

Until recently, this has only been possible through CPU- and memory-intensive analytics and correlation. Now, the answer to current limitations around data processing may lie in distributed correlation.

Scaling up and scaling out
Architects are typically faced with a choice between ‘scaling up’ and ‘scaling out’. The former involves bigger and ever more powerful machines which, while they may work for some applications, can be expensive and require frequent, cost-prohibitive ‘fork-lift’ upgrades. The latter, an approach commonly taken by pioneers such as Google and Facebook, involves spreading workloads across clustered systems, often built with commodity hardware, with the additional benefit of fault tolerance and availability that improve as the cluster grows.
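As a purely illustrative sketch of the scale-out idea (the node names, event IDs and hashing scheme below are hypothetical, not those of any particular SIEM), the Python snippet spreads the same stream of events across whichever nodes the cluster currently contains, so adding commodity machines thins the load on each one:

    import hashlib

    def node_for_event(event_id: str, nodes: list) -> str:
        """Pick a node for an event by hashing its ID across the current cluster."""
        digest = int(hashlib.sha256(event_id.encode()).hexdigest(), 16)
        return nodes[digest % len(nodes)]

    # Scaling out: adding commodity nodes spreads the same workload more thinly.
    small_cluster = ["node-1", "node-2"]
    bigger_cluster = small_cluster + ["node-3", "node-4"]

    events = [f"evt-{i}" for i in range(1000)]
    for cluster in (small_cluster, bigger_cluster):
        per_node = {n: 0 for n in cluster}
        for e in events:
            per_node[node_for_event(e, cluster)] += 1
        print(per_node)

A production cluster would typically use consistent hashing or a similar scheme so that adding a node moves only a fraction of the keys, but the scale-out principle is the same.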

Whichever approach is chosen, it is likely to impact an organization’s SIEM application and correlation engine. 

The adage that “all problems in computer science can be solved by another level of indirection” is usually attributed to the computer scientist David Wheeler; it points to solving a problem through an intermediate layer, ideally one that brings some additional benefit with it.

One way of interpreting this is to have modular software components, each with a distinct function, interface with one another to solve a more complex problem. Done correctly, this allows core software components to be decoupled and, in some cases, distributed across separate hardware systems. The downside is that such an approach can prove overwhelming for an organization’s SIEM.
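As a loose illustration of that ‘level of indirection’ (the component and event names are invented for the example), the sketch below decouples a producer of events from its consumer by placing a simple bus between them; either side could later be moved to its own host behind a network-backed queue without the other changing:

    from queue import Queue

    class EventBus:
        """Indirection layer: producers and consumers know only the bus, not each other."""
        def __init__(self):
            self._q = Queue()

        def publish(self, event: dict) -> None:
            self._q.put(event)

        def consume(self):
            while not self._q.empty():
                yield self._q.get()

    # The in-process queue could later be swapped for a network-backed broker
    # without changing the producer or consumer code.
    bus = EventBus()
    bus.publish({"src": "10.0.0.5", "action": "login_failure"})
    bus.publish({"src": "10.0.0.9", "action": "dns_query"})

    for evt in bus.consume():
        print("correlate:", evt)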

Freeing up the manager
SIEM tools have traditionally comprised distinct services for processing and analyzing events in real time, all packaged into a single ‘persister’ or ‘manager’. The manager, however, is required to juggle everything from rules and data monitors to database and active list read/writes, any of which can bring it to its knees. CPU and memory exhaustion caused by poorly written rules or bloated active lists, for example, is a common issue.

‘Exploding’ these services across clustered hosts enables organizations to run multiple instances of the correlator and aggregator services on many hosts, freeing the manager to focus on writing events to, and retrieving them from, the event database.

Essentially, correlators are CPU-intensive, aggregators are memory-intensive, and the persister is I/O-intensive. Because each of these services can be scaled independently, distributed correlation enables organizations to scale out their SIEM in a number of different ways, allowing it to meet the most demanding needs and the most complex use cases.
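A minimal sketch of that division of labour, using hypothetical class names, rules and thresholds rather than any vendor’s actual components: the correlator does the CPU-heavy rule matching, the aggregator holds the memory-heavy running state, and the persister only performs I/O, so each role can be scaled out on its own:

    import json

    class Correlator:
        """CPU-bound: evaluate correlation rules against each incoming event."""
        def matches(self, event: dict) -> bool:
            return event.get("action") == "login_failure"

    class Aggregator:
        """Memory-bound: keep running state (here, counts per source address)."""
        def __init__(self):
            self.counts = {}

        def add(self, event: dict) -> int:
            key = event.get("src", "unknown")
            self.counts[key] = self.counts.get(key, 0) + 1
            return self.counts[key]

    class Persister:
        """I/O-bound: write events of interest to the event store."""
        def __init__(self, path: str):
            self.path = path

        def write(self, event: dict) -> None:
            with open(self.path, "a") as fh:
                fh.write(json.dumps(event) + "\n")

    correlator, aggregator, persister = Correlator(), Aggregator(), Persister("eoi.jsonl")
    for event in [{"src": "10.0.0.5", "action": "login_failure"}] * 6:
        # Only persist once the rule fires and the running count crosses a threshold.
        if correlator.matches(event) and aggregator.add(event) >= 5:
            persister.write({**event, "rule": "repeated_login_failure"})

In a distributed deployment, many correlator and aggregator instances would run on separate hosts, leaving the persister as the only service tied to the event database.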

Efficiently extracting intelligence
Even with recent advances in throughput, more robust storage options, and more powerful processing hardware, most organizations will find themselves constantly evaluating the cost-benefit of event ingestion into their centralized SIEM and analytics tools. 

Distributed correlation, however, offers the opportunity to throw significantly more data at the correlation engine, which will, in turn, ‘bubble up’ the events of interest (EOI). As a result, event sources that once may have been too much to handle, such as endpoint logs, threat intelligence matches, DNS logs, or network flows, can now be used in the correlation logic to provide more contextual data around EOI and to improve the fidelity of alert rules.
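For illustration only, and with invented field names and an invented threat-intelligence list, the sketch below shows how a high-volume feed such as DNS logs might add context to an event of interest and raise or lower the fidelity of the resulting alert:

    # Hypothetical threat-intelligence match list and DNS log entries.
    THREAT_INTEL_DOMAINS = {"c2.example", "bad-domain.example"}

    dns_logs = [
        {"src": "10.0.0.5", "query": "intranet.example"},
        {"src": "10.0.0.5", "query": "c2.example"},
    ]

    def bubble_up(eoi: dict, dns_logs: list) -> dict:
        """Attach matching DNS / threat-intel context to an event of interest."""
        context = [
            entry for entry in dns_logs
            if entry["src"] == eoi["src"] and entry["query"] in THREAT_INTEL_DOMAINS
        ]
        return {**eoi, "dns_context": context, "fidelity": "high" if context else "low"}

    print(bubble_up({"src": "10.0.0.5", "rule": "repeated_login_failure"}, dns_logs))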

Different storage retention policies mean these correlation events can be retained for longer than the underlying base events, enabling the SIEM and its correlation engine to sort the wheat from the chaff. What’s more, adding security context to the raw data in real time makes it instantly usable for analysis.
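One way to picture that retention split, with purely illustrative windows (real values are a policy decision for each organization):

    from datetime import datetime, timedelta, timezone

    # Purely illustrative retention windows; real values are policy decisions.
    RETENTION = {
        "base_event": timedelta(days=30),
        "correlation_event": timedelta(days=365),
    }

    def is_expired(event_type: str, created_at: datetime, now: datetime) -> bool:
        return now - created_at > RETENTION[event_type]

    now = datetime.now(timezone.utc)
    created = now - timedelta(days=90)
    print(is_expired("base_event", created, now))         # True  -> purged
    print(is_expired("correlation_event", created, now))  # False -> retained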

At the center of an intelligent SOC lies the ability to efficiently extract intelligence from the huge volume of data available to organizations today. Distributed correlation represents a way of unlocking that capability, offering organizations a powerful new way to scale their SIEM and surface the events that matter most.
