Cloudflare and the Art of Owning Your Mistakes

If you visited Discord, OkCupid, CoinDesk, or several other popular websites earlier this month, you might have been greeted with a 502 Bad Gateway error. This wide-ranging web blackout was caused by an outage at network service provider Cloudflare. Internet users weren’t even able to check the popular outage-tracking site DownDetector, as the site itself was taken down by the outage.

The industry was quick to speculate that the outage was caused by a hostile DDoS attack. That is unsurprising, given that massive internet outages have become synonymous with DDoS attacks in recent years. A key case is the 2016 Dyn cyber-attack, where a series of DDoS attacks targeting DNS systems caused a similarly widespread internet outage.

As for Cloudflare’s response, the company’s CEO, Matthew Prince, was quick to provide updates via Twitter. In the hours following the news, Prince confirmed the outage was caused by a massive spike in CPU usage, and quickly reassured users who presumed it was caused by an attack.

Cloudflare CTO John Graham-Cumming then published a company blog post confirming the outage was caused by a single misconfigured rule within the Cloudflare firewall, deployed as part of a standard rules update, which caused CPU usage on the company’s machines to spike to 100%.

While internet outages are frustrating for developers and internet users alike, the transparent way in which Cloudflare handled its outage deserves serious praise. Some companies may shudder at the thought of disclosing the technical details and cause of a network outage, whether because of the potential financial implications or sheer embarrassment.

The fact is, though, customer loyalty and trust are more likely to be earned by companies willing to be fully transparent when an issue occurs. Doing so doesn’t take away the harm and inconvenience of an outage, but it does command respect. The positive reaction online to Cloudflare’s handling of the outage is a testament to this.

Being open also shows that companies rightly view an outage as more than just an IT issue. These situations ultimately have a wide-reaching impact on end users, and it’s only right to acknowledge these end users by involving them in the aftermath.

By having both its CEO and CTO respond to the outage, Cloudflare showed how seriously it regarded the matter. It’s also worth noting that Cloudflare isn’t bound to disclose security breaches the way European companies are. Despite this, the company still provided clear statements, truly leading by example.

While Cloudflare’s response was commendable, the causes of the outage should still be assessed. The company has already admitted its testing process before the downtime was insufficient and it’s now looking to improve these processes. This is a welcome step; constant testing is a must in ensuring networks are completely secure. It’s only through testing that network vulnerabilities and misconfigured rules are uncovered and addressed.

The outage also reinforces a message all IT pros should already be familiar with: network monitoring is just as important as establishing network defenses. While defending against external threats should be a priority for IT pros, so should the monitoring of networks with the correct tools and software.
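As a rough illustration of the kind of check such monitoring relies on, the sketch below flags CPU usage that stays saturated across several consecutive samples, the symptom reported in this outage. It is a minimal example under stated assumptions: the class name, the 95% threshold, and the five-sample window are all hypothetical, and it does not represent Cloudflare’s tooling or any particular monitoring product.

```python
from collections import deque


class SustainedCpuAlert:
    """Hypothetical helper: flags CPU usage that stays at or above a
    threshold for a full window of consecutive samples."""

    def __init__(self, threshold_pct=95.0, window=5):
        # Both defaults are illustrative assumptions, not recommendations.
        self.threshold_pct = threshold_pct
        self.window = window
        # Keep only the most recent `window` samples.
        self.samples = deque(maxlen=window)

    def observe(self, cpu_pct):
        """Record one CPU sample (0-100) and return True only when the
        window is full and every sample in it is at or above the threshold."""
        self.samples.append(cpu_pct)
        return (
            len(self.samples) == self.window
            and all(s >= self.threshold_pct for s in self.samples)
        )
```

Requiring a full window of saturated samples, rather than alerting on a single reading, is one simple way to avoid paging on brief and harmless spikes while still catching a fleet that is pinned at 100%.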

A lot can be learned from the recent Cloudflare episode. Approaching the fallout of an outage in a transparent and conscientious way is something all companies should aspire to. The outage also demonstrates the damage internal IT errors can inflict. Cloudflare had the strong network visibility needed to quickly locate and address the cause of its error, and not all IT pros will have this visibility. If there were ever a call to action for network monitoring, this is it.

Sascha Giese
