However, in the era of mobile devices and location-based marketing, privacy and physical tracking have become concerns. And one researcher says that the quick fix offered up by the industry – hashing and cryptography – is only a shell of a solution, and he has demonstrated it.
Last July, news surfaced that major national retail chains were testing technology that would allow them to automatically track shoppers’ location through stores via mobile devices. After expressing deep concern, US Sen. Chuck Schumer (D-N.Y.), the Future of Privacy Forum (FPF) and a group of location analytics companies – including Euclid, iInside (a WirelessWERX company), Mexia Interactive, SOLOMO, Radius Networks, Brickstream and Turnstyle Solutions – announced that they had agreed to a Code of Conduct to promote consumer privacy and responsible data use for retail location analytics.
The code calls for in-store posted signs that alert shoppers that tracking technology is being used, and instructions for how to opt out. And, crucially, it involves putting data protection standards in place to de-identify data from specific users via a crypto algorithm.
“This is a significant step forward in the quest for consumer privacy,” said Schumer at the time. “This agreement shows that technology companies, retailers and consumer advocates can work together in the best interest of the consumer.”
Unfortunately, it turns out that the cryptography method in use can be trivially undone.
“Your smartphone includes Wi-Fi and Bluetooth chips, and those chips each have a unique serial number, called a MAC address,” explained Jonathan Mayer, a grad student at Stanford University, in a blog post. “Periodically your phone will announce itself, including those MAC addresses. The most common approach to retail analytics simply logs these broadcasts and compiles [shopper activity into a database].”
Under the Code of Conduct, before a MAC address is saved, it gets passed through a cryptographic hash function that produces a fixed, random-looking string for each MAC address. That has the effect, supposedly, of anonymizing the data. “Hashed data cannot be reverse-engineered by a third party to reveal a device’s MAC address,” explained Euclid in its privacy statement. “This means that anyone who gains access to the database . . . would see only long strings of numbers and letters. They would be unable to get any information that could be linked back to a particular mobile device owner.”
But Mayer said that if someone wants to learn the shopping history associated with a particular MAC address, they can simply apply the hash function, then look up the hash in the database.
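The lookup Mayer describes can be sketched in a few lines of Python. This is only an illustration: the article does not say which hash function the analytics firms use, so SHA-256 is an assumption, and `hash_mac`, `lookup_history`, `visit_log` and the sample records are hypothetical.

```python
import hashlib


def hash_mac(mac: str) -> str:
    """Hash a MAC address after normalizing it (SHA-256 is an assumption)."""
    normalized = mac.lower().replace("-", ":")
    return hashlib.sha256(normalized.encode("ascii")).hexdigest()


# Hypothetical analytics database: hashed MAC address -> recorded visits.
visit_log = {
    hash_mac("a4:5e:60:12:34:56"): ["store 12, aisle 3", "store 12, checkout"],
}


def lookup_history(mac: str, log: dict) -> list:
    """Anyone who knows a MAC address can hash it and read its history."""
    return log.get(hash_mac(mac), [])


print(lookup_history("A4-5E-60-12-34-56", visit_log))
```

Because the hash is deterministic, the same MAC address always maps to the same database key, so the stored records are pseudonymous rather than anonymous.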
“Hashing is...no defense against re-identification,” he said. “In under an hour, and for less than a dollar, I built a cloud system that reverses hashed MAC addresses.”
He explained, “I rented a server with a graphics card in Amazon’s cloud. Hashing involves parallelized math, so a graphics card gives a substantial performance boost. Next, I installed oclHashcat, a fast hash-checking program. Writing format files for hashcat took just a few minutes. Then, with no effort at optimization, I tossed in the hash of my smartphone’s MAC address. Reversing the hash took just 12 minutes. Total cost: $0.65, plus tax.”
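The reason such a search is feasible is that a MAC address is only 48 bits long, so every possible value can be hashed and compared. The Python sketch below shows the idea on a deliberately tiny search space; it is not Mayer's GPU-based oclHashcat setup. SHA-256, the function names, the sample address and the shortcut of assuming the three-byte manufacturer prefix is already known are all assumptions made for illustration.

```python
import hashlib


def hash_mac(mac_bytes: bytes) -> str:
    """Hash the six raw bytes of a MAC address (SHA-256 is an assumption)."""
    return hashlib.sha256(mac_bytes).hexdigest()


def reverse_hash(target: str, prefix: bytes, suffix_bits: int = 24):
    """Try every device suffix under a known 3-byte manufacturer prefix."""
    for suffix in range(1 << suffix_bits):
        candidate = prefix + suffix.to_bytes(3, "big")
        if hash_mac(candidate) == target:
            return candidate.hex(":")
    return None  # not found in the searched range


# Hypothetical device; the search is limited to 2**16 suffixes so it runs quickly.
prefix = bytes.fromhex("a45e60")
leaked_hash = hash_mac(prefix + bytes.fromhex("00beef"))
print(reverse_hash(leaked_hash, prefix, suffix_bits=16))  # a4:5e:60:00:be:ef
```

Searching the full 48-bit space is the same loop run over far more candidates, which is the kind of parallel work a graphics card handles well.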
Two common alternatives, salted hashing and hash-based message authentication codes (HMAC), integrate extra information in the course of hashing, he said, frustrating attacks that rely on pre-computed hash values. But even these have limits.
“They do not…protect against attacks that involve actually computing hashes,” Mayer explained. “If an employee or intruder has access to hashed MAC addresses, they presumptively also have access to the extra information. Salting and HMAC are no solution here.”
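Mayer's point can be illustrated with a keyed hash: once the key leaks alongside the database, the same exhaustive search works again. This is a minimal sketch under assumptions; HMAC-SHA-256, the key value, the function names and the reduced search space are all hypothetical choices, not details from the article.

```python
import hashlib
import hmac


def keyed_hash(key: bytes, mac_bytes: bytes) -> str:
    """HMAC of the raw MAC address bytes (HMAC-SHA-256 is an assumption)."""
    return hmac.new(key, mac_bytes, hashlib.sha256).hexdigest()


def reverse_keyed_hash(target: str, key: bytes, prefix: bytes, suffix_bits: int = 24):
    """With the key in hand, recompute the HMAC for every candidate address."""
    for suffix in range(1 << suffix_bits):
        candidate = prefix + suffix.to_bytes(3, "big")
        if hmac.compare_digest(keyed_hash(key, candidate), target):
            return candidate.hex(":")
    return None


# Hypothetical secret key that an employee or intruder also obtained.
stolen_key = b"example-secret-key"
prefix = bytes.fromhex("a45e60")
stored = keyed_hash(stolen_key, prefix + bytes.fromhex("001234"))
print(reverse_keyed_hash(stored, stolen_key, prefix, suffix_bits=16))  # a4:5e:60:00:12:34
```

The key defeats lookup tables built in advance, but it does nothing to enlarge the small space of possible MAC addresses.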
If shoppers opted into store-provided apps running on their own smartphones, there would be viable privacy-preserving approaches to retail analytics, he said. But short of turning off Wi-Fi and Bluetooth while out and about, most people are at risk of unauthorized tracking.
“The appeal of Wi-Fi and Bluetooth, of course, is that shoppers are automatically included,” he said.