Facebook Builds its Own Threat Information Framework

ThreatData is a framework for collating information on internet threats and making it accessible for both real-time defensive systems and long-term analysis. It’s a bespoke effort composed of three high-level parts: feeds, data storage and real-time response.

“When we began sketching out a system to solve this problem, we encountered issues others have faced: every company or vendor uses their own data formats, a consistent vocabulary is rare and each threat type can look very different from the next,” said Facebook security staffer Mark Hammell, in a blog. “With that in mind, we set about building what we now call ThreatData.”

Feeds collect data from various sources and are implemented via a lightweight interface. The data can be in nearly any format, and the feed transforms it into a simple schema that the company calls the ThreatDatum. To build the database, Facebook is using data from VirusTotal; malicious URLs from multiple open source blogs and malware tracking sites; vendor-generated threat intelligence that it purchases; its own internal sources of threat intelligence; and browser extensions that import data as a Facebook security team member reads an article, blog or other content.
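As a rough illustration of the feed idea, the Python sketch below normalizes one raw source into a common record. The article does not describe the real ThreatDatum schema or feed interface, so the field names, the `Feed` protocol, the `CsvUrlFeed` class and the sample input are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Protocol


@dataclass(frozen=True)
class ThreatDatum:
    """Hypothetical normalized record; the real schema is not public here."""
    threat_type: str   # e.g. "malicious_url" or "malware_hash"
    indicator: str     # the URL, file hash, IP address and so on
    source: str        # name of the feed that produced the record
    first_seen: str    # timestamp string as reported by the source


class Feed(Protocol):
    """Assumed lightweight interface: a feed yields normalized records."""

    def fetch(self) -> Iterator[ThreatDatum]: ...


class CsvUrlFeed:
    """Example feed that parses 'timestamp,url' lines from a URL blocklist."""

    def __init__(self, name: str, lines: Iterable[str]) -> None:
        self.name = name
        self.lines = lines

    def fetch(self) -> Iterator[ThreatDatum]:
        for line in self.lines:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comments
            timestamp, url = line.split(",", 1)
            yield ThreatDatum(
                threat_type="malicious_url",
                indicator=url.strip(),
                source=self.name,
                first_seen=timestamp.strip(),
            )


# Made-up sample input in the assumed 'timestamp,url' format.
raw_lines = [
    "# sample blocklist",
    "2014-03-25T10:00:00Z, http://bad.example/login",
    "",
]
records = list(CsvUrlFeed("example_blocklist", raw_lines).fetch())
```

Each source would get its own small feed class, while everything downstream only ever sees the shared record type.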

Once a feed has transformed the raw data, the result is fed into two existing data repository technologies: Hive for long-term data analysis and Scuba for short-term analysis. Hive storage answers questions like, “Have we ever seen this threat before?” and “What type of threat is more prevalent from our perspective: malware or phishing?” Scuba, meanwhile, sits at the opposite end of the analysis spectrum, answering questions like, “What new malware are we seeing today?” and “Where are most of the new phishing sites?”
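To make the two kinds of question concrete, here is a small Python sketch that answers them over an in-memory list. It does not use Hive or Scuba; the list, the sample records and the function names are hypothetical stand-ins for whatever queries Facebook actually runs.

```python
from collections import Counter
from datetime import date

# (threat_type, indicator, date_logged) tuples standing in for stored records.
history = [
    ("malware", "hash_a", date(2014, 1, 5)),
    ("phishing", "http://phish.example/a", date(2014, 2, 1)),
    ("phishing", "http://phish.example/b", date(2014, 3, 24)),
    ("phishing", "http://phish.example/c", date(2014, 3, 25)),
    ("malware", "hash_b", date(2014, 3, 25)),
]


def seen_before(indicator):
    """Long-term question: have we ever seen this threat before?"""
    return any(stored == indicator for _, stored, _ in history)


def most_prevalent_type():
    """Long-term question: which threat type appears most often?"""
    counts = Counter(threat_type for threat_type, _, _ in history)
    return counts.most_common(1)[0][0]


def new_on(threat_type, day):
    """Short-term question: what was logged for this type on a given day?"""
    return [
        indicator
        for stored_type, indicator, logged in history
        if stored_type == threat_type and logged == day
    ]


ever_seen = seen_before("hash_a")
top_type = most_prevalent_type()
todays_malware = new_on("malware", date(2014, 3, 25))
```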

The last piece of the puzzle is making all of that data actionable. Facebook uses a homegrown processor to examine each ThreatDatum at the time of logging and to act on each new threat.

For instance, all malicious URLs collected from any feed are sent to the same blacklist used to protect people on facebook.com; interesting malware file hashes are automatically downloaded from known malware repositories, stored, and sent for automated analysis; and threat data is propagated to Facebook's internal security event management system, which is used to protect its corporate networks.
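The routing described above can be sketched as a small dispatcher that runs type-specific handlers when a record is logged. This is only an illustration in Python: the handler registry, the in-memory blacklist, the analysis queue and the event list are hypothetical stand-ins, not Facebook's actual processor.

```python
from collections import defaultdict
from typing import NamedTuple


class ThreatDatum(NamedTuple):
    """Hypothetical minimal record used only for this sketch."""
    threat_type: str
    indicator: str


handlers = defaultdict(list)   # threat_type -> list of handler functions
url_blacklist = set()          # stand-in for the site-wide URL blacklist
analysis_queue = []            # stand-in for the automated malware analysis queue
siem_events = []               # stand-in for the security event management system


def on(threat_type):
    """Decorator that registers a handler for one threat type."""
    def register(handler):
        handlers[threat_type].append(handler)
        return handler
    return register


@on("malicious_url")
def block_url(datum):
    url_blacklist.add(datum.indicator)


@on("malware_hash")
def queue_for_analysis(datum):
    analysis_queue.append(datum.indicator)


def process(datum):
    """Run at logging time: type-specific handlers, then forward the record."""
    for handler in handlers.get(datum.threat_type, []):
        handler(datum)
    siem_events.append(datum)


process(ThreatDatum("malicious_url", "http://bad.example/"))
process(ThreatDatum("malware_hash", "abc123"))
```

Keeping the handlers in a registry means a new response can be added for a threat type without touching the code that logs the record.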

So far, the social network said that it is logging successes with the initiative.

“Now that we have the ThreatData framework in place, we continue to iterate on it, more Facebook engineers are hacking on it, and we are bringing in new types of threats,” Hammell said.

For instance, it has been able to uncover a spam campaign using fake Facebook accounts to send links to malware designed for feature phones. The malware is capable of stealing a victim's address book, sending premium SMS spam and using the phone's camera to take pictures. The framework allowed Facebook to analyze the malware, disrupt the spam campaign and work with partners to disrupt the botnet's infrastructure.

It is also using the framework to beef up its anti-virus posture, by feeding hashes that are expressly not detected by its third-party anti-virus product into the custom security event management system. Hammell said that as a result, it’s been able to detect both adware and malware installed on visiting vendor computers that no single anti-virus product could have found.
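In outline, that gap-filling step is a set difference followed by a lookup. The Python sketch below assumes the known-bad hashes and the anti-virus detections are available as simple sets; the hash values and the `find_gap_hits` helper are made-up placeholders.

```python
# All hash values are short made-up placeholders, not real file hashes.
known_bad_hashes = {"aaa111", "bbb222", "ccc333", "ddd444"}
detected_by_av = {"aaa111", "ccc333"}

# Hashes known to be bad that the anti-virus product does not flag.
gap_hashes = known_bad_hashes - detected_by_av


def find_gap_hits(hashes_seen_on_host):
    """Return the hashes seen on a host that fall into the detection gap."""
    return sorted(gap_hashes & set(hashes_seen_on_host))


hits = find_gap_hits(["bbb222", "eee555", "aaa111"])
```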

Facebook is also adding context to the data as it goes, including Autonomous System, ISP and country-level geocoding on every malicious or victimized IP address logged to the repository. As a result, it can understand where threats are coming from, arranged by type of attack, time and frequency.
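A minimal Python sketch of this kind of enrichment follows. It assumes a small lookup table keyed by network prefix; the table, the ISP names, the country codes and the event records are invented for illustration, and the addresses and AS numbers come from ranges reserved for documentation.

```python
import ipaddress
from collections import Counter

# Hypothetical lookup table: network prefix -> context to attach to a record.
GEO_TABLE = [
    (ipaddress.ip_network("192.0.2.0/24"),
     {"asn": 64496, "isp": "Example ISP A", "country": "AA"}),
    (ipaddress.ip_network("198.51.100.0/24"),
     {"asn": 64497, "isp": "Example ISP B", "country": "BB"}),
]


def enrich(record):
    """Return a copy of the record with ASN, ISP and country fields added."""
    address = ipaddress.ip_address(record["ip"])
    for network, context in GEO_TABLE:
        if address in network:
            return {**record, **context}
    # No matching prefix: keep the record and mark the context as unknown.
    return {**record, "asn": None, "isp": None, "country": None}


events = [
    {"ip": "192.0.2.10", "attack": "phishing"},
    {"ip": "192.0.2.77", "attack": "phishing"},
    {"ip": "198.51.100.5", "attack": "malware"},
    {"ip": "203.0.113.9", "attack": "malware"},
]
enriched = [enrich(event) for event in events]

# Frequency of each (country, attack type) pair across the enriched records.
by_country_and_attack = Counter(
    (event["country"], event["attack"]) for event in enriched
)
```

With the context attached at logging time, grouping by country, ISP or attack type becomes a simple count over the stored records.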

“Discoveries and detection capabilities like these are just the tip of the iceberg,” Hammell said. “We've found that the framework lets us easily incorporate fresh types of data and quickly hook into new and existing internal systems, regardless of their technology stack or how they conceptualize threats.”
