Every Single American Household Exposed in Massive Leak

Yet another cloud storage misconfiguration has exposed personally identifiable information (PII), according to researchers, this time affecting 123 million American households across billions of data points.

The scale of the issue puts it in the running with the infamous Equifax incident, as it touches virtually every American household.

The UpGuard Cyber Risk Team found a cloud-based data repository containing data from Alteryx, a California-based data analytics firm. The repository exposed massive data sets, including the full data set for the ConsumerView marketing database of Alteryx partner Experian, the consumer credit reporting agency that competes with Equifax, as well as public information from the 2010 US Census and other collected data sets.

“From home addresses and contact information, to mortgage ownership and financial histories, to very specific analysis of purchasing behavior, the exposed data constitutes a remarkably invasive glimpse into the lives of American consumers,” said UpGuard researchers Chris Vickery and Dan O'Sullivan in an analysis.

At issue, once again, is an Amazon Web Services S3 cloud storage bucket that was misconfigured and inadvertently left open to the public internet, where anyone with an internet connection could have found it.

“While, in the words of Experian, ‘protecting consumers is our top priority,’ the accumulation of this data in ‘compliance with legal guidelines,’ only to then see it left downloadable on the public internet, exposes affected consumers to large-scale misuse of their information,” the researchers said.

That misuse could mean simple spamming and unwanted direct marketing, or more nefarious effects such as organized fraud techniques like phantom debt collection, identity theft and the bypassing of security verification checks.

Collecting data for marketing purposes has become commonplace, as serving advertising and brand messages is one of the largest pieces of the American economy, responsible for monetizing content, social media and more. Also, some of the information in the trove comes from publicly available sources: The US Census data included, for instance, consists of already published 2010 Census data that is publicly available at census.gov. "The company implicated had no access to PII collected by the Census Bureau, nor did the reported data leak involve Census Bureau servers or Census Bureau data stored through cloud services," the US Census Bureau stressed.

However, when public information like this is combined with other information, and more and more collation takes place across many different sources, these databases become fingerprints: troves of information that provide a startlingly complete picture of each individual profiled.

“The continuing concentration of data by a number of large enterprises, now wielding powerful technology of the sort provided by Alteryx, has not been accompanied by greater prudence and process improvement necessary to ensure that the data will remain securely stored,” the researchers said. “The result has been, in the same way warming waters increase the power of hurricanes, that data exposures such as this are capable of exposing the vast majority of American households to compromise with one error.”

The data spans a wide variety of specific personal information, including age, gender, education, occupation and marital status, along with “lifestyle and interest data” (profiles are grouped in headings like “cruise enthusiasts” and “domestic traveler”) plus “financial indicators, including card usage and creditworthiness.”

The incident also reveals how third-party vendor risk is getting out of control, the researchers added.

“The exposure of massive amounts of data from three different enterprises in one cloud leak—including from a federal agency—reveals how the consequences of cyber insecurity can, in an increasingly interdependent technological environment, quickly afflict partners and expose their data as well.”

The database has now been safeguarded, but it remains to be seen what information, if any, has fallen into bad actors’ hands. Alteryx and Experian downplayed the incident:

“Specifically, this file held marketing data, including aggregated and de-identified information based on models and estimations provided by a third-party content provider, and was made available to our customers who purchased and used this data for analytic purposes,” Alteryx said in a statement to Forbes. “The information in the file does not pose a risk of identity theft to any consumers.”

Experian added a pass-the-buck aspect to its response: "This is an Alteryx issue, and does not involve any Experian systems. Alteryx has already confirmed with you that the data in question contained no names of any individuals or any other personal identifying information, and does not pose any risk of identity theft to any consumers. We have been assured by Alteryx that they promptly remedied this issue."
