How to Build an Autonomic Security Operations Center (SOC)


The security industry has long wrestled with the practicalities of modernizing and automating the security operations center (SOC). Alert fatigue, false positives, burnout, churn, a changing threat landscape, a global skills gap, a lack of resources and silos are all common security problems, but some would argue they are felt more acutely in the SOC than anywhere else. SOC teams have been asked to do too much with too little for too long.

Google’s Anton Chuvakin and Iman Ghanizada recently coined the term autonomic security operations to describe a new kind of SOC, which aims to solve many of the challenges above and make the SOC:

"An innate defense mechanism for the organization, the same way that DevOps strives to provide this capability on the IT side."

It’s an ambitious transformation project, but there are several practical steps that can be taken and technology choices that can be made to turn this ambition into reality. Meanwhile, it’s also vital that businesses understand the potential challenges, pitfalls and who could be left behind. What follows is a quick guide to the need for an autonomic SOC, how to create one and the potential pitfalls.

What is an Autonomic SOC?

An autonomic SOC is not one single thing; it consists of several key elements relating to scale, visibility and automation. As a starting point, an autonomic SOC requires the collection of as much telemetry as possible, as fast as possible, from all IT and OT environments, including the network, cloud and endpoints. Many businesses struggle with this kind of scale today and often have blind spots somewhere; those that do collect all this data often suffer from alert fatigue.

Automation is also essential because businesses cannot hope to scale up their security teams to match the modern telemetry needs outlined above. Therefore, they must embrace machine learning and the creation of human-made detection rules so that 99.99% of alerts are dealt with by software before analysts ever see them. Only then can businesses quickly determine if something is a known threat and automate response actions. Again, this is a radical departure from most current SOCs. Many SOC professionals prefer to handle most alerts themselves.
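As a rough illustration of software handling alerts before an analyst sees them, the sketch below routes each alert to one of three outcomes: auto-close, automated response, or escalation to a human. Everything in it is hypothetical and chosen for illustration only: the `Alert` fields, the `KNOWN_BENIGN` and `KNOWN_THREATS` tables and the action names do not come from any particular SIEM or SOAR product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    source: str     # hypothetical: which sensor raised the alert
    signature: str  # hypothetical: identifier of the detection that fired
    host: str       # hypothetical: affected asset


# Assumption: in practice these tables would be fed by detection rules
# and ML verdicts, not hard-coded.
KNOWN_BENIGN = {"scheduled-vuln-scan"}
KNOWN_THREATS = {"ransomware-beacon": "isolate_host"}


def triage(alert: Alert) -> str:
    """Decide what happens to an alert before a human sees it."""
    if alert.signature in KNOWN_BENIGN:
        return "auto_close"
    action = KNOWN_THREATS.get(alert.signature)
    if action is not None:
        return f"auto_respond:{action}"
    # Anything the software cannot classify goes to an analyst.
    return "escalate_to_analyst"
```

Only the final branch reaches an analyst, which is the property the approach aims for: the queue shrinks to whatever the rules and models cannot yet explain.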

How to Create an Autonomic SOC

Scale is the first major challenge to achieving an autonomic SOC. Inevitably, it requires cloud-scale, elastic technology – the kind of infrastructure, reliability and speed that only the likes of Google, Amazon and Microsoft can provide. SOC traditionalists have been reluctant to move to the cloud in the past, preferring to keep operations on-premises, but this is no longer practical given the rate at which telemetry is growing. The SOC cannot afford to ignore the performance and economies of scale offered by the cloud.  

Vast amounts of telemetry can lead to alert fatigue if it’s not dealt with smartly and processes aren’t automated. To introduce effective automation into the SOC, analysts must act like developers. SOCs must apply a basic software development lifecycle to their detection rules, treating them as software/code that needs to be QA’d, refined and peer-reviewed continuously.   
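One way to picture detection rules being treated as code is a rule that carries a version and ships with test cases that must pass before it is deployed. The sketch below assumes a simple regex-based rule over a process command line. The `DetectionRule` class, the `command_line` event field and the `run_rule_tests` helper are all hypothetical names used for illustration, not the rule format of any specific product.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DetectionRule:
    name: str
    version: str  # bumped on every reviewed change, like any other code
    pattern: str  # regex applied to the event's command line

    def matches(self, event: dict) -> bool:
        command_line = event.get("command_line", "")
        return re.search(self.pattern, command_line, re.IGNORECASE) is not None


# Hypothetical rule: flag PowerShell launched with an encoded command.
ENCODED_POWERSHELL = DetectionRule(
    name="encoded_powershell",
    version="1.1.0",
    pattern=r"powershell(\.exe)?\b.*\s-(e|enc|encodedcommand)\s",
)


def run_rule_tests(rule: DetectionRule,
                   should_fire: List[dict],
                   should_not_fire: List[dict]) -> List[str]:
    """Return a list of failures; an empty list means the rule passed QA."""
    failures = []
    for event in should_fire:
        if not rule.matches(event):
            failures.append(f"missed: {event}")
    for event in should_not_fire:
        if rule.matches(event):
            failures.append(f"false positive: {event}")
    return failures


failures = run_rule_tests(
    ENCODED_POWERSHELL,
    should_fire=[{"command_line": "powershell.exe -enc SQBFAFgA"}],
    should_not_fire=[{"command_line": "powershell -ExecutionPolicy Bypass -File a.ps1"}],
)
```

A check like this could run in a review pipeline, so a rule change that starts missing known-bad samples or firing on known-good ones is caught before it reaches production.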

This requires investment in SOC analysts capable of thinking like software developers and empowering them with the right tools, processes and technology. Businesses should also be developing new analysts with an autonomic SOC mindset, training them to manage and create detection rules and enhance automation, not just handle alerts.

What Are the Pitfalls?

An autonomic SOC relies on cloud infrastructure, so cloud outages naturally rank highly as a potential pitfall, although cloud ecosystems can be designed to make such outages very unlikely. Google’s SIEM product, Chronicle, for instance, runs on Google’s highly redundant Borg infrastructure. It’s the same infrastructure that underpins Google Search, so it is fast and resilient, and downtime is rare.

Creating an autonomic SOC also requires the right people: those who are open to doing software development and letting the technology do a lot of the heavy lifting. Many current SOC pros came to the role from an IT infrastructure background and may lack the software development skills needed to adjust to an autonomic SOC.

Finally, a common mistake is to dump new tools on analysts without giving them any input into the process. It’s important to let the SOC team take the lead on the autonomic SOC journey.

Final Thoughts

Ultimately, the rationale behind the autonomic SOC has existed for years. There is a clear demand for change in this industry, but only recently has the technology caught up sufficiently to make it happen.

Now is the time for the SOC to break out of its silo, embrace automation and the cloud, and finally realize Google’s goal of the SOC being an innate defense mechanism for the organization.
