Netflix' second data challenge on revealing customers' DVD rental habits has privacy experts hopping mad

29 September 2009

Privacy advocates are furious at plans by DVD rental service Netflix to unveil more data about the rental habits of its customers. Experts argue that the data could easily be used to identify customers and draw inferences about their lifestyles.

Last week, Netflix announced that it had awarded the prize for its first US$1 million Netflix challenge. The company had made rental data available to the public, asking for algorithms that would help Netflix make its movie recommendation system more accurate. The original challenge focused on improving the recommendation system for those rental customers who had already rated large numbers of films using the Netflix website.

The company simultaneously announced a second challenge, with the same prize, this time focusing on improving the recommendation system for those customers who don't rate movies often, or at all. To do this, it said that it would take advantage of demographic and behavioural data "carrying implicit signals about the individuals' taste profiles".

The new data set includes information about customer age, gender, zip code, genre ratings, and previously chosen movies. "As with the first Netflix prize, all data provided is anonymous and cannot be associated with a specific Netflix member", it said.

However, experts argue that it has already been proven possible to identify customers from the anonymous data that Netflix provides. Paul Ohm, associate professor of law and telecommunications at the University of Colorado law school, argued in a paper published this August that Netflix' attempt to anonymize the data in its first challenge was fatally flawed. He pointed to University of Texas researchers Arvind Narayanan and Professor Vitaly Shmatikov, who found that individuals within the data set could be identified with a high degree of probability using just a little outside knowledge about their movie-watching preferences.
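The general idea behind this kind of re-identification can be illustrated with a short Python sketch. It is not the researchers' actual method, which the article does not describe. It is a simplified exact-match illustration, and the record IDs, film titles, and ratings are all invented. Each extra fact an outsider knows about a person's ratings shrinks the set of anonymous records that could belong to that person.

```python
def matching_records(dataset, known):
    """Return IDs of anonymous records consistent with every known (title, rating) pair."""
    return [
        record_id
        for record_id, ratings in dataset.items()
        if all(ratings.get(title) == rating for title, rating in known.items())
    ]


# Hypothetical "anonymized" data: no names, only record IDs and ratings.
dataset = {
    "user_001": {"Film A": 5, "Film B": 3, "Film C": 4},
    "user_002": {"Film A": 5, "Film B": 1},
    "user_003": {"Film A": 5, "Film C": 4, "Film D": 2},
}

# Knowing one popular rating narrows nothing down...
print(matching_records(dataset, {"Film A": 5}))
# ...but one additional, less common rating isolates a single record.
print(matching_records(dataset, {"Film A": 5, "Film D": 2}))
```

In this toy data, the first query matches all three records and the second matches only `user_003`.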

Ohm praised Netflix for at least trying to consult with experts when releasing the data for its first challenge, but expressed concern over the second one. "Netflix should cancel this new, irresponsible contest", he warned. "Researchers have known for more than a decade that gender plus zip code plus birthdate uniquely identifies a significant percentage of Americans."
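Ohm's point about combining attributes can be made concrete with a small Python sketch. It measures what share of records become unique once several attributes are taken together. The records below are made up for illustration and say nothing about the actual Netflix data, and the function name `unique_fraction` is an assumption, not part of any published tool.

```python
from collections import Counter


def unique_fraction(records, keys):
    """Return the share of records whose combination of `keys` appears exactly once."""
    combos = Counter(tuple(r[k] for k in keys) for r in records)
    unique = sum(1 for r in records if combos[tuple(r[k] for k in keys)] == 1)
    return unique / len(records)


# Hypothetical records containing only the three attributes Ohm mentions.
records = [
    {"gender": "F", "zip": "80302", "birthdate": "1975-03-14"},
    {"gender": "F", "zip": "80302", "birthdate": "1975-03-14"},
    {"gender": "M", "zip": "80302", "birthdate": "1980-07-01"},
    {"gender": "F", "zip": "78701", "birthdate": "1975-03-14"},
]

print(unique_fraction(records, ["gender"]))                      # 0.25
print(unique_fraction(records, ["gender", "zip", "birthdate"]))  # 0.5
```

Even in this tiny sample, gender alone singles out one record in four, while the three attributes together single out half of them.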

Despite being sent Ohm's comments, Netflix staff stuck to the company line. "The information we’re giving in the Netflix Prize 2 dataset is completely anonymous. It contains no personally identifiable information. It does not contain anyone’s name, address, or any means to connect a particular record with a specific Netflix member", said the company in a statement to Infosecurity magazine.

"As in Netflix Prize 1, the dataset contains some movie ratings from select anonymous members. It also includes some queue adds and taste preferences, broad age ranges, gender and zip codes but, again, completely anonymous. All that data is modified – our scientists call it perturbed – to make it anonymous."

This article is featured in:
Compliance and Policy  • Internet and Network Security
