Building Privacy Into AI: Is the Future Federated?

The changing dynamics of the digital world have created several privacy challenges for businesses, large and small, placing increasing pressure on them to evolve their processes and strategies. Much of the burden stems from the sheer volume of data generated today: global data volume is predicted to balloon to 175 zettabytes (ZB) by 2025. Processing that data while protecting privacy is simply beyond human capability without the assistance of privacy-enhancing technologies (PETs).

This has led to an explosion of adaptive machine learning (ML) algorithms that can wade through mountains of data while continuously and efficiently adjusting their behavior in real time as new data streams in. However, while ML is key to leveraging and learning from big data at scale, it can create privacy challenges of its own. Traditional ML requires data to be gathered on a centralized server for analysis, often by transporting it to cloud environments, and this opens the door to a plethora of security and privacy risks.

Taking It to the Edge

These privacy and security concerns have driven demand for ML technology that preserves consumer privacy, which is why federated learning (FL) has gained such momentum. Federated learning, put simply, is a decentralized form of machine learning: a method of training an algorithm on user data held across multiple edge devices or servers without exchanging or transferring that data to a central location.

With federated learning, a global model is generated on a central server, while the data used to train it remains distributed across edge devices. The data stays with its owner yet still contributes to insights generated centrally. If the mountain will not come to Muhammad, Muhammad must go to the mountain: the model is brought to the data, where it is trained and updated, rather than the data traveling to the model. Federated learning is one of the best examples of the new breed of edge computing, in which computation and data storage are moved closer to the data source.
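
To make the mechanics concrete, here is a minimal sketch of one federated averaging round in Python. The simple linear model, the simulated device datasets and the hyperparameters are illustrative assumptions for demonstration, not any particular production system:

```python
# A minimal sketch of federated averaging (FedAvg) using NumPy.
import numpy as np

def local_update(global_weights, X, y, lr=0.1, epochs=5):
    """Train a copy of the global model on one device's private data."""
    w = global_weights.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)  # gradient of the least-squares loss
        w -= lr * grad
    return w  # only updated weights leave the device, never X or y

# Simulate three edge devices, each holding its own private dataset.
rng = np.random.default_rng(0)
devices = [(rng.normal(size=(20, 3)), rng.normal(size=20)) for _ in range(3)]

global_weights = np.zeros(3)
for round_num in range(10):
    # Each device trains locally; the server never sees the raw data.
    local_weights = [local_update(global_weights, X, y) for X, y in devices]
    # The server averages the updates (weighted equally here for simplicity).
    global_weights = np.mean(local_weights, axis=0)
```

The design point is that the central server only ever sees weight vectors; the raw training examples never leave the devices that generated them.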

Looking to support a privacy-first future for web advertising and protect its biggest revenue stream, Google is a leading proponent of the technology and has recently launched its Federated Learning of Cohorts (FLoC) as a replacement for traditional third-party cookies, which it plans to stop supporting by 2023.

Spreading the Load

While there are multiple benefits to federated learning, it does have certain limitations, not least that it requires frequent communication between nodes during the learning process. Each participating device therefore needs sufficient local computing power and memory to train the model, and sufficient bandwidth to exchange model parameters in real time, all of which can affect user experience.

Luckily, with the emergence of technologies such as 5G, today’s communications infrastructure is more than robust enough to handle this. Plus, the edge devices that technologies such as Google’s FLoC typically talk to tend to be powerful mobile phones with several gigabytes of memory. This means that these technical barriers to federated learning have been all but removed.

Plugging the Gaps

Because federated learning enables multiple actors to build a common, robust ML model without sharing data, it addresses critical issues for ML such as data security, data access rights and access to heterogeneous data. However, while it will undoubtedly become an essential part of the modern marketing technology stack, federated learning must be implemented carefully. Even though the technique is, by its very nature, a leap forward in data privacy, it is still imperative that privacy-by-design principles are observed at all times.

Privacy-by-design proactively embeds privacy into the design and operation of IT systems, networked infrastructure and business practices. Only when federated learning is paired with other privacy mechanisms – such as secure multi-party computation, differential privacy and quantitative privacy measurement – can privacy risks be considered addressed. Federated learning is therefore a case of plugging the gaps to ensure you remain compliant with increasingly stringent privacy regulations.
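
As a hedged illustration of one such pairing, the sketch below shows how each device might clip its model update and add Gaussian noise before transmission, in the spirit of differential privacy. The clip norm and noise scale are assumed values; a real deployment would calibrate them to a formal (epsilon, delta) privacy budget:

```python
# A minimal sketch of privatizing a federated update before it is sent.
import numpy as np

def privatize_update(update, clip_norm=1.0, noise_std=0.5, rng=None):
    """Clip an update's norm, then add Gaussian noise before transmission."""
    rng = rng or np.random.default_rng()
    # Clip so no single user's update can dominate the server's aggregate.
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / max(norm, 1e-12))
    # Add calibrated noise so individual contributions are masked.
    return clipped + rng.normal(scale=noise_std, size=update.shape)
```

Clipping bounds any single user’s influence on the aggregate model, and the added noise masks individual contributions: precisely the kind of gap-plugging this section describes.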

A Challenge of the Digital Era

As the global wave of privacy legislation and privacy activism accelerates the need for privacy-preserving techniques, federated learning is one worthy example of the progress being made to mould a data-led economy underpinned by privacy. The adoption of privacy-enhancing technologies is transforming the way businesses approach compliance challenges, overcome operational inefficiencies, accelerate data-driven strategies and evolve their AI initiatives. Rather than viewing data privacy as a business blocker, those embracing a privacy-by-design ethos understand that protecting privacy is a gateway to a greater depth of insights that can fuel growth and power innovation.
