The Facebook Outage and the Case for Cyber-Resilience

Early reports of the Facebook outage were quick to include the comment that the outage was not due to a cyber-attack, the implication being that it was somewhat less worrisome than if it were. Millions of users, including small businesses that rely on Facebook services for their daily operations, and people in parts of the world in which these services are the primary means of reliable communication, were cut off from a vital resource. It was just a mistake, no need for concern. Facebook is protected from attacks. Yet, the impact of this mistake was real and widely felt, arguably more widely felt than most of the cyber-attacks that gain so much attention. This raises an interesting question: are we worried about the wrong thing?

The crime-fighting attack-and-defend language of cybersecurity has directed our attention toward addressing the malicious actions that lead to cyber-breaches, data losses and denial of service rather than addressing the consequences of those actions. Instead, we should make sure our systems stay operational or can quickly return to operational health and maintain our data integrity, regardless of the form of attack. In fact, when looked at in terms of system impact, an attack is the same as a mistake, power outage or earthquake. Actions taken to protect against the consequences of an attack can also address the recovery from these other disruptions. This is cyber-resilience.

Cyber-resilience is a company’s ability to minimize impact and recover if systems or data have been compromised. Cyber-resilience covers adversarial threats such as hackers and other malicious actors and non-adversarial threats such as human error, natural disaster or failures in interrelated systems. Regardless of the cause of the problem, resilient protections minimize the effect.

To be sure, one dimension of resilience lies in protecting against particular causes of failure, and cause-specific solutions are needed for this protection. So, for example, you use a surge protector to guard against system threats from lightning and a virus checker to guard against the threat from some cyber-attacks. (To be fair, Facebook had some audit functions in place that were meant to protect against error, though they proved inadequate.) However, the impact minimization and recovery aspects of resilience can generally address outages regardless of cause. Yet this dimension of the problem, which is boring system design and management stuff, typically receives less attention and less corporate investment than the more manly and exciting attack response provided by cybersecurity mechanisms.

Some of this disparity can be attributed to the way enterprises are organized. Generally, the resilience features of availability, reliability and recovery are the purview of the network or infrastructure departments, while vulnerability to attack is the domain of the security department. Departments often compete for funds, and requirements between departments are often thrown over the wall with little concern for their impact on other departments. In this way, we institutionalize the classic system engineering problem of not considering all concerns jointly. There was a telling statement at the end of Facebook’s October 5 blog posting on the cause of the network outage:

“We’ve done extensive work hardening our systems to prevent unauthorized access, and it was interesting to see how that hardening slowed us down as we tried to recover from an outage caused not by malicious activity, but an error of our own making. I believe a trade-off like this is worth it – greatly increased day-to-day security vs. a slower recovery from a hopefully rare event like this.”

I think it is worth discussing whether the trade-off between security and recovery is even necessary, let alone worth it. Perhaps our artificial separation of cybersecurity into a thing in itself rather than an aspect of a unified response to risk is forcing us to make bad bargains.

Patricia Muoio