Interview: Misha Govshteyn, CSO and Co-Founder, Alert Logic

“For every $1 spent on securing web applications, $23 is spent protecting everything else,” says Misha Govshteyn, co-founder and CSO of Alert Logic. “The ratio seems off.”

Sitting with Govshteyn, I had just finished an Infosecurity webinar on machine learning, and he asked me about the traction in this sector. He pointed out that he does not subscribe to the view of machine learning as a replacement for anti-virus, and that "we’ll probably get to the point where we have self-driving cars before security for applications."

He added that, having hired people for data science roles some five years ago, it is only now that those appointments are proving fruitful, as they have the data sets to work with.

“The most robust applications require training, as there is a supervised machine learning model but you need to be trained and have continuous feedback and you need to generate results, own them and continue to refine the training,” he said. “Think about how most of the products are sold right now; most of the vendors don’t know what passes through them and they don’t see the data or a training set.”

Govshteyn argued that this is part of the reason why self-driving cars have taken off, as you can teach them once on human activity. A data center, meanwhile, ingests 1 PB of data every month and stores 14 PB for users, but servers do not read email, and that is why he felt the concept of machine learning has taken off.

On a recent edition of the Down the Security Rabbithole podcast, CrowdStrike chief scientist Dr Sven Krasser detailed the benefits and pitfalls of machine learning. The main pitfall is that it requires data scientists. The benefit over anti-virus software is that you "do not need to update anti-virus signatures every day, but the problem is you are exposed between signatures". With proper categorization of data, he said, you have the ability to get answers from your data sets.

“Machine learning is more around making predictions, to look at data and try and find structure in that data and try to find labels for data,” Krasser said. “One thing people ask me ‘is this Skynet in the security space?’ but that is what is called an AGI (artificial general intelligence) that is probably still some time off, so don’t worry about your security software locking you out of your house!”

On the benefits, Krasser said the appeal of machine learning for anti-malware is that once you have trained your machine learning model to know what malware looks like, “you don’t need to update your signatures every day. The problem is you’re exposed from the last signature update to the next, and that can be 24 hours,” he said.

“For machine learning, for the models we are training, we keep them around for a month or something and they would be good for longer, but at some point you want to re-train them so that they capture the redistribution of files that you see in the field, and that is one of the appeals – you have something that knows what malware looks like without looking for specific things in the malware files.”

Govshteyn said that when Alert Logic started its machine learning efforts, it was achieving a 76% accuracy rate, but within a couple of months it rose to 96%. “We don’t see it as a replacement for signatures, we use a lot of the input of the signatures as an output so it lets you collect, so we collected 210 incidents. What machine learning has done is allow us to correlate in a different way and when we look back at the indicators, about 60% were detected previously as it was part of a campaign so we escalated it to the customers so they are aware of it.

“It allows you to build a story for a customer so we can tell them that someone has been stalking them out and they need to be aware of this, so they need to fix things as the campaign does not seem to be going away.”

Govshteyn added that many companies are not doing machine learning, but among those that are, he predicted more of a move towards securing cloud applications specifically, and towards purpose-built solutions. This will also provide more opportunities for data scientists and security analysts, as at the moment "they are lost and their conclusions are way off as they are looking at data in abstracts". The components are critical, he said, as otherwise they don’t know.

To conclude, I asked Govshteyn whether customers come to Alert Logic and ask for machine learning, or ask about it. He said that is not a question they ever get asked, but a lot of clients will say ‘I’ll buy your service but are you going to protect me from everything?’

“The reality is if the attack takes three months to execute and they come from different parts of the internet, and it is complex enough to generate hundreds of thousands of events, I know my human analyst cannot analyze those manually, so the answer has to be machine learning as what else can we do? If traditional correlation was going to work, we would have done it by now so when we face really high level of expectations, the more logical answer is we are going to do as much as we can with signatures, do as much as we can with humans – but where they stop is where we turn to machine learning.”
