#Infosec17: Effective Machine Learning - Know Your Data & Where it Comes From

Speaking at Infosecurity Europe 2017, Oliver Tavakoli, CTO of Vectra Networks, said that when implementing machine learning it is important to fully understand your data and where it has come from, and to know what the technology is trying to achieve.

In his session ‘Defeating & Abusing Machine Learning-based Detection Technologies’, Tavakoli explained that one problem with machine learning in information security is that attackers can use techniques to attack and pollute your data, which can then affect the algorithm’s ability to draw accurate conclusions and lead to flawed results.

“With machine learning, what you are doing is having the data train the program, so there’s not a human now involved in writing the logic, so the data can train the algorithm but it can also fool the algorithm. You have to be really careful about where you get your data from.”
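
As a rough illustration of the point about data fooling the algorithm, the sketch below trains a deliberately simple detector on clean data and then on polluted data. It is not taken from Tavakoli's session: the single numeric feature, the sample values, and the helper names `train_threshold` and `is_flagged` are all hypothetical, and real detection models are far more complex.

```python
from statistics import mean


def train_threshold(samples):
    """Learn a cut-off halfway between the mean benign and mean malicious value.

    `samples` is a list of (value, label) pairs. This toy "model" is an
    assumption made for illustration only.
    """
    benign = [value for value, label in samples if label == "benign"]
    malicious = [value for value, label in samples if label == "malicious"]
    return (mean(benign) + mean(malicious)) / 2


def is_flagged(value, threshold):
    """Flag any observation above the learned cut-off."""
    return value > threshold


# Hypothetical clean training data: low values are benign, high values malicious.
clean = [
    (1, "benign"), (2, "benign"), (3, "benign"), (2, "benign"),
    (9, "malicious"), (10, "malicious"), (11, "malicious"), (10, "malicious"),
]

# Hypothetical pollution: the attacker slips in high values labelled as benign.
poison = [
    (9, "benign"), (10, "benign"), (11, "benign"),
    (10, "benign"), (9, "benign"), (11, "benign"),
]
poisoned = clean + poison

clean_threshold = train_threshold(clean)
poisoned_threshold = train_threshold(poisoned)

# The same suspicious observation is scored by both models.
suspicious_value = 8
print("clean model flags it:", is_flagged(suspicious_value, clean_threshold))
print("poisoned model flags it:", is_flagged(suspicious_value, poisoned_threshold))
```

In this toy case the mislabelled samples drag the learned cut-off upwards, so an observation the clean model would flag passes the poisoned model unnoticed. No human rewrote any logic; only the training data changed.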

What you really want to know about machine learning, he added, is whether the systems you have are giving a feedback loop to the attacker, so they can see whether something has succeeded or failed, or whether they are just raising silent alarms without causing any kind of change in action.

Therefore, said Tavakoli, there are some key questions that you should be asking when looking to acquire and use machine learning-based services.

Firstly, if you are purchasing the end result of other people’s machine learning, you need to be asking what data is being used as input, where it comes from and how it is kept free of pollution.

Secondly, if you are building your own machine learning platform, you need to be asking how good your data science team is, how you ensure the data acquisition process has integrity and whether the data includes the right features to detect the use cases you care about.

Tavakoli concluded with a further question: “Is your team capable of carrying the load the rest of the way?” It’s about ambiguity, as “black and white only gets you so far.”
