AI and Data Privacy: Compatible, or at Odds?

Data privacy is a controversial topic today, and as new regulations emerge globally to protect consumer data, that’s not likely to change. The GDPR in Europe and the CCPA in California are setting new standards for protecting consumer data and giving consumers more rights over the collection and use of their personal information.

At the same time, they’re raising awareness among consumers about how their data is being used, sometimes without their knowledge.

A parallel trend that some believe is at odds with data privacy is Artificial Intelligence (AI). AI is increasingly woven into our daily lives, and its usefulness depends on the three “Vs” of big data: volume, variety and velocity. Curtailing access to data can therefore impact AI’s effectiveness.

Influential leaders such as Alphabet CEO Sundar Pichai have expressed their belief that AI must be regulated, but it’s important to weigh not only the concerns around AI but also the benefits it offers. The ultimate question we have to address is: how do we leverage the best of what AI has to offer without compromising data privacy?

Why Is Consumer Trust Broken?
In 2020, the world will hold an estimated 40 trillion gigabytes of data, and every person will generate 1.7 megabytes every second. Privacy advocates often clash with technology vendors and marketers. The vast majority of businesses (97.2%) are spending money on AI to hone marketing outreach, segment target audiences, tailor advertising and offers, and generate relevant content and experiences that lead to higher conversion rates. Consumers know this, and they’re reacting with mixed emotions. These emotions revolve around one core requirement: trust.

Consumers also know that organizations may share or sell data without consumers’ knowledge, and high-profile violations of trust such as the Cambridge Analytica scandal exacerbate the mistrust that consumers feel toward the organizations acquiring and using their data. 

Building Trust with Value
When it comes to providing personal data, consumers usually have two fundamental prerequisites for trusting the process:

  1. They receive meaningful value in exchange for sharing their personal data.
  2. They understand how their data will be protected and that it will only be used for the stated purpose. 

The level of understanding around the value exchange varies with demographics. A consumer’s age, nationality, culture and circumstances all shape how they view privacy and the boundaries they set around sharing data. But the majority of consumers understand that accessing the full functionality of their apps and services requires sharing some amount of personal information.

For example, without access to location data, the GPS functionality on a smartphone won’t work. If you don’t provide your routing and account numbers to PayPal or Venmo, you can’t electronically transfer money. Ordering goods on Amazon requires a delivery address, phone number and other personal information. Consumers are willing to provide this data because of the perceived value of the exchange, and because they trust the organization will deliver on the promise of that value.

With AI, the value exchange isn’t always obvious, and when it’s not, trust can break down. Are there ways to leverage datasets to inform AI-powered technologies without compromising data privacy—or consumer trust?

Data for AI: Less is More
Several emerging technologies make it possible to draw on large datasets while preserving anonymity:

  • Differential privacy systems focus on ways to share datasets of personal information without exposing recognizable details about the individuals to whom the data belongs. In this way, it’s possible to derive valuable insights from data without compromising data privacy (a minimal sketch of this approach follows this list).
  • Multi-party computation is another advanced approach: several parties jointly compute a result over their combined personal data while each party’s individual inputs remain concealed (the secret-sharing sketch after this list illustrates the core idea).
  • Zero-factor authentication offers a solution that involves building a digital DNA for an individual using data from online behaviors. This authentication paradigm can be highly effective and secure without compromising privacy, provided it rests on robust data collection, continuous data analysis and total transparency.
  • Unsupervised machine learning (UML) can accurately monitor user account behavior without acquiring extensive personal data about individual users. By deploying graph and clustering techniques, UML can observe and detect known and unknown patterns, uncovering connections that help determine with a high degree of accuracy whether a given action or set of actions is legitimate or fraudulent. Like differential privacy, UML draws conclusions from aggregated user actions and behaviors without violating the privacy of individual users (see the clustering sketch below).
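
For intuition, here is a minimal sketch of differential privacy’s best-known building block, the Laplace mechanism, applied to a simple count query. The dataset, predicate and epsilon value are invented for illustration:

```python
import numpy as np

def private_count(records, predicate, epsilon):
    """Answer a count query with Laplace noise. A count has
    sensitivity 1 (adding or removing one person changes it by at
    most 1), so noise scaled to 1/epsilon masks any individual's
    presence in the dataset."""
    true_count = sum(1 for r in records if predicate(r))
    noise = np.random.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

# Hypothetical query: how many users in the dataset are over 40?
ages = [23, 45, 31, 52, 38, 61, 29, 47]
print(private_count(ages, lambda age: age > 40, epsilon=0.5))
```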
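
Multi-party computation comes in many forms; the sketch below shows one common building block, additive secret sharing, under the assumption of three honest parties. The salaries and modulus are hypothetical:

```python
import random

PRIME = 2**61 - 1  # all arithmetic is done modulo this prime

def share(secret, n_parties):
    """Split a secret into n random shares that sum to it mod PRIME.
    Any subset of fewer than n shares reveals nothing about it."""
    shares = [random.randrange(PRIME) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % PRIME)
    return shares

# Three parties want the total of their private salaries without
# revealing individual values to one another.
salaries = [52_000, 61_000, 48_000]
all_shares = [share(s, 3) for s in salaries]

# Each party locally sums the one share it received from each owner...
partial_sums = [sum(col) % PRIME for col in zip(*all_shares)]
# ...and only the recombined total is ever revealed.
print(sum(partial_sums) % PRIME)  # 161000
```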
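
Finally, a sketch of the unsupervised approach: the snippet below clusters accounts by behavioral features alone, with no identity data, and flags accounts that fit no dense cluster as anomalous. The features, synthetic data and DBSCAN parameters are illustrative assumptions, not a production fraud model:

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler

# Hypothetical per-account features: logins per day and average
# transaction amount. No personal identity data is required.
rng = np.random.default_rng(42)
normal = rng.normal(loc=[3.0, 50.0], scale=[1.0, 10.0], size=(200, 2))
suspicious = np.array([[40.0, 900.0], [35.0, 850.0]])  # unusual activity
features = np.vstack([normal, suspicious])

# DBSCAN labels points that belong to no dense cluster as -1,
# flagging accounts whose behavior deviates from the norm.
X = StandardScaler().fit_transform(features)
labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(X)
print("Flagged account indices:", np.where(labels == -1)[0])
```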

Such technologies are making it possible to embrace a “less is more” approach to collecting and using data, yielding powerful results for organizations without eroding consumer trust.

Restoring Trust in the Digital Age
The path forward for restoring consumer trust requires full transparency as to what data organizations collect and how they use it. Using transformational technologies and techniques such as unsupervised machine learning, we can leverage AI to a greater extent for the benefit of consumers and the world at large, while keeping data privacy intact.
