Web Scraping, Sweden, and Detection vs. Prevention

On Friday I met with Mathias Elvang, head of consulting services at Stockholm-based security consultancy Sentor.

We touched on the information security market in Sweden – by the sounds of it, it’s pretty small – and then the issue of web scraping.

Some estimates suggest that up to 60% of visitors to websites are actually web scrapers harvesting site data for commercial and/or competitive purposes, a situation Elvang described as “an identified problem which people couldn’t stop”.

Wikipedia describes web scraping as “a computer software technique of extracting information from websites. Usually, such software programs simulate human exploration of the World Wide Web by either implementing low-level Hypertext Transfer Protocol (HTTP), or embedding a fully-fledged web browser, such as Internet Explorer or Mozilla Firefox.”

“You can buy scraping services from India and places”, Elvang advised. “When we blacklist scrapers, they change their tactics. We block one million IP addresses at any one time through behavior analysis”, he told me.

Sentor considers web scraping to be one of the most serious information security concerns (not surprising, of course, given that it sells an anti-scraping tool, Scrape Sentry), yet Elvang admits it is an issue given very little airtime. “People don’t understand the risk. They have no idea about the extent of the problem – it’s an awareness issue”, he said.

While the technology is based on white-listing, Elvang is keen to emphasise that false positives are unacceptable. “It would be unacceptable to block a genuine customer – it’s imperative that we enable the business”.
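
For readers wondering what behavior-based flagging with an allow-list might look like in practice, here is a minimal Python sketch. It is not Sentor's method, which Elvang did not describe in detail. The class name, the request threshold, the time window and the example IP addresses are all assumptions made for illustration, and request rate is only one of many signals a real system would weigh.

```python
from collections import defaultdict, deque


class ScraperDetector:
    """Hypothetical sketch: flags clients whose request rate looks automated."""

    def __init__(self, max_requests=5, window_seconds=10.0, allowlist=()):
        self.max_requests = max_requests      # assumed threshold, not a real-world value
        self.window_seconds = window_seconds  # assumed sliding-window length
        self.allowlist = set(allowlist)       # known-good clients, never flagged
        self.recent = defaultdict(deque)      # ip -> timestamps of recent requests
        self.flagged = set()                  # ips that have tripped the threshold

    def observe(self, ip, timestamp):
        """Record one request; return True if the IP is currently flagged."""
        if ip in self.allowlist:
            return False
        window = self.recent[ip]
        window.append(timestamp)
        # Drop timestamps that have fallen out of the sliding window.
        while window and timestamp - window[0] > self.window_seconds:
            window.popleft()
        if len(window) > self.max_requests:
            self.flagged.add(ip)
        return ip in self.flagged


detector = ScraperDetector(max_requests=5, window_seconds=10.0,
                           allowlist={"203.0.113.7"})

# A client making a request every 0.5 seconds trips the threshold...
fast = [detector.observe("198.51.100.1", t * 0.5) for t in range(10)]
# ...while one making a request every 5 seconds does not.
slow = [detector.observe("198.51.100.2", t * 5.0) for t in range(10)]
# The allow-listed address is never flagged, however fast it goes.
trusted = [detector.observe("203.0.113.7", t * 0.1) for t in range(50)]

print(any(fast), any(slow), any(trusted))  # prints: True False False
```

Note that the sketch only flags an address and leaves the decision to block to the caller. A threshold set too low would catch genuine customers, which is exactly the false-positive risk Elvang describes.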

Lastly, we discussed prevention versus detection, which is a debate I’m hearing more about in infosec circles. “We do digital surveillance. Ninety percent of our effort is focused on detection, 10% on prevention”, he told me. Prevention, he said, is the job of the client themselves. “If we find a problem, we call the client – the equivalent of a physical alarm.”
