Mob Mentalities: The World of Crowdsourced Software Development

Danny Bradbury examines security in crowdsourced software development
Open source projects – which often use multiple developers from around the world contributing to a single code base – represent just one crowdsourcing model

December was a nasty month for OpenBSD. The open-source Unix derivative came under fire from a former contractor, who said that he was paid to embed backdoor code for the FBI. Gregory Perry, formerly CTO at NETSEC, where he arranged donations and funding for OpenBSD’s cryptographic framework, claims that he and others inserted the backdoors so that the FBI could monitor site-to-site encrypted VPN traffic.

Scott Lowe, one of the people named by Perry, denies the allegations, and no one has moved to patch the operating system yet. “They’re now scouring the network stacks. No-one knows if it’s true”, says Jeff Williams, CEO of software testing firm Aspect Security and chair of the OWASP Foundation (the Open Web Application Security Project), which dedicates itself to helping make web applications more secure. But that’s the problem: it is difficult to know whether open-source software is compromised, and if so, how badly.

A Sea of Unknown Unknowns

“I try not to be a pessimistic kind of guy, but this is easy to do and totally devastating. So why wouldn’t they do it?”, asks Williams. “If I was a bad guy, I would do it. There’s no chance you’d get caught, and even if you did you could probably deny it. You could compromise the world’s biggest systems.”

Crowdsourcing has been a big part of the software development process since the GNU Manifesto was published in 1985. Open-source projects, which often use multiple developers contributing to a single code base, represent just one crowdsourcing model. Another more recent development is the use of collaborative tools and online markets to farm out the development of proprietary code to third-party contractors.

"We break up code into modules, so it isn’t like you can just take some open source code and work it into a competition"
Ira Heffan, TopCoder

Staff in small businesses can use sites like elance.com and guru.com to find multiple developers and commission them to develop parts of an application, which they can then bolt together. It’s a cheap and fast way to get an application to market. It is, as Scott Matsumoto, principal consultant at software security firm Cigital, puts it, the outsourcing equivalent of herding cats.

Who’s Accountable?

Both models come with potential problems. Software security firm Coverity, which publishes an annual report exploring the state of security in open-source software, points out that open-source projects lack accountability. Many developers from different organizations work on such projects. “Given the internal supply chain within open source itself, who is accountable to upholding these requirements and providing visibility to OEMs?”, the report asks. “And who is to blame if and when there is a problem?”

Google’s Android open-source mobile operating system was one of 291 open-source projects that Coverity scanned for security flaws. It found that Android’s defect density (the number of bugs per line of code) was slightly less than half the average, but that 25% of the flaws were considered high-risk, with the potential to cause security vulnerabilities, data loss, or quality problems such as system crashes. “These are traditionally defect types that many of our customers fix and eliminate completely prior to shipping a product”, Coverity points out.

"You can run code through an obsfucator and change the variable names, or take a chunk of code and re-factor it"
Michael Westmacott, British Computer Society

How can organizations prevent security flaws – whether accidental or deliberately planted – from showing up in any crowdsourced code they use? There are several techniques, but none of them should be used in isolation.

Static analysis (as used by Coverity) is one method. It employs automated tools to search source code for specific types of flaws. However, such tools are unlikely to find every type of vulnerability.
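To give a sense of the kind of defect these tools do catch reliably, consider the short, hypothetical C fragment below (the function names and scenario are invented for illustration): an unbounded copy of caller-supplied input into a fixed-size buffer, which most static analyzers flag out of the box, together with a bounded alternative.

    #include <stdio.h>
    #include <string.h>

    /* Hypothetical fragment of the kind a static analyzer flags automatically:
     * caller-supplied input copied into a fixed-size stack buffer with no
     * length check (a classic buffer overflow). */
    static void log_username(const char *input)
    {
        char name[32];
        strcpy(name, input);               /* flagged: unbounded copy */
        printf("login attempt: %s\n", name);
    }

    /* A bounded alternative: limit the copy and terminate the string. */
    static void log_username_safe(const char *input)
    {
        char name[32];
        strncpy(name, input, sizeof(name) - 1);
        name[sizeof(name) - 1] = '\0';
        printf("login attempt: %s\n", name);
    }

    int main(void)
    {
        log_username("alice");        /* harmless here, exploitable with longer input */
        log_username_safe("alice");
        return 0;
    }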

Automated tools are especially unlikely to find flaws that have been deliberately worked into the code by a malicious third party intent on hiding them, argues Matsumoto.

I’ll Show You Mine If You Show Me Yours

“The problem of detecting backdoors in the code is much thornier. One would like some form of automated scanning to do this task”, Matsumoto says. “The problem is that the scanner needs some form of pattern to search for, and no pattern exists or would be easily subverted.”
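A hypothetical sketch in C shows why (the ‘magic’ flag and the scenario are invented): a back door can hinge on a single character, a ‘=’ where a reader expects ‘==’, and trivial variations in names or spacing will defeat any signature a scanner might use.

    #include <stdio.h>

    #define WANT_DEBUG 0x80000000u    /* invented 'magic' flag for this sketch */

    /* Hypothetical back door disguised as an input check: when the caller
     * passes the magic flag, the second clause assigns 0 to *uid (making the
     * caller 'root' in this sketch) and then evaluates to false, so the
     * apparent rejection below never fires.  Wrapping the assignment in its
     * own parentheses also tends to keep compilers from warning about it. */
    static int check_request(unsigned int flags, int *uid)
    {
        if ((flags & WANT_DEBUG) && ((*uid = 0)))
            return -1;                /* never reached */
        return 0;                     /* request allowed */
    }

    int main(void)
    {
        int uid = 1000;
        check_request(WANT_DEBUG, &uid);
        printf("uid after check: %d\n", uid);   /* prints 0: privilege silently granted */
        return 0;
    }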

Peer review of code is another protection mechanism, in which programmers go through each other’s code manually to try to find flaws. In some programming methodologies, such as the eXtreme Programming approach outlined by software developer Kent Beck, two programmers work side by side coding the same function, so they can compare notes and spot errors in each other’s techniques.

"I try not to be a pessimistic kind of guy. But this is easy to do and totally devastating. So why wouldn’t they do it?"
Jeff Williams, OWASP

Yoav Aner is a software architect at Testuff, a firm providing an online software test management service. He also developed a methodology for finding errors in crowdsourced applications, which was published by the UK’s Royal Holloway, University of London. Aner says that the nature of open-source programming lends itself to peer review. Programmers on an open-source project generally need to look at each other’s code as part of the coding process for their own functions, he says, meaning that code will often pass across many screens before being officially validated.

There is a caveat, however, which comes back to the lack of accountability in fragmented open-source teams. “If you have good programmers, they should be security trained and aware, and have the ability to find security bugs”, Aner points out. “Therefore, even if you ask 5-6 people to look at the same code, if they’re not qualified for security, then they could miss those bugs.”

Paying for Bugs

Some companies adopt a structured approach to peer review, even putting money behind it to attract people who know how to find security bugs. For example, Google has offered bounties totaling $10,000 (£6,500) to programmers who can find high-risk flaws in its Chrome browser, effectively crowdsourcing the peer-review process.

Aner’s paper highlights threat modeling as a key technique for evaluating the security defects in crowdsourced software. This process, which should ideally happen during software design, concentrates on the real-world risks that attackers could identify and use to create exploits for the code.

Aner’s threat modeling approach draws heavily on the threat modeling process used as part of Microsoft’s Security Development Lifecycle, initially formulated to weed out security problems during the software development process. It breaks down into four discrete steps: application analysis; threat enumeration; threat rating; and mitigation options.

"The problem of detecting backdoors in the code is much thornier. One would like some form of automated scanning to do this task"
Scott Matsumoto, Cigital

Analyzing the application includes understanding any dependencies on third-party software libraries, and identifying usage scenarios for the software (including unauthorized scenarios).

Enumeration identifies threats according to six categories: Spoofing (disguising identity); Tampering (altering data); Repudiation (covering one’s tracks after an attack); Information disclosure (breaching information confidentiality); Denial of service (stopping the software from fulfilling its function); and Elevation of privilege (gaining rights the user should not have). Taken together, these form the acronym ‘STRIDE’.

Rating the threats then helps the security team prioritize the measures it will take to mitigate them.
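As a purely illustrative sketch (the scenario, ratings and mitigations below are invented rather than taken from Aner’s paper or Microsoft’s documentation), an enumerated and rated threat list for a crowdsourced code-submission workflow might be captured along these lines, with each entry pairing a STRIDE category with a rating that feeds the prioritization step:

    #include <stdio.h>

    enum stride {
        SPOOFING, TAMPERING, REPUDIATION,
        INFO_DISCLOSURE, DENIAL_OF_SERVICE, ELEVATION_OF_PRIVILEGE
    };

    static const char *const stride_names[] = {
        "Spoofing", "Tampering", "Repudiation",
        "Information disclosure", "Denial of service", "Elevation of privilege"
    };

    struct threat {
        enum stride category;
        const char *description;
        int risk;                   /* 1 (low) to 5 (high), used for prioritization */
        const char *mitigation;
    };

    int main(void)
    {
        /* Invented threats against a crowdsourced code-submission workflow */
        struct threat threats[] = {
            { SPOOFING,    "Contributor commits under another developer's identity",
              4, "Require per-developer credentials and signed commits" },
            { TAMPERING,   "Malicious change slipped into a module before merge",
              5, "Mandatory peer review plus static analysis on every submission" },
            { REPUDIATION, "Contributor denies having pushed a back-doored change",
              3, "Tamper-evident audit log on the central repository" },
        };

        for (size_t i = 0; i < sizeof(threats) / sizeof(threats[0]); i++)
            printf("[%s, risk %d] %s -> %s\n",
                   stride_names[threats[i].category], threats[i].risk,
                   threats[i].description, threats[i].mitigation);
        return 0;
    }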

Identifying Threats

Ultimately, threat modeling is designed to help identify real-world threats to a particular application and to seal those cracks as effectively as possible. Aner enhances the Microsoft model, however, by also applying these threat modeling techniques to the software development process itself, helping to lock down areas where the workflow between distributed, loosely connected teams of developers could be compromised by an attacker, for example.

Ideally, threat modeling should be used in conjunction with other software protection techniques, Aner argues. “It’s really defense in depth, layering as many approaches together as you can and having an efficient delivery cycle”, he says, explaining that developer education and awareness play an important part. Security reviewers should also apply proper code testing processes, including both automatic and manual ones. They should incorporate effective regression testing into their procedures, to check the effectiveness of existing code in addition to new functionality.

"If you have good programers, they should be security trained and aware, and have the ability to find security bugs"
Yoav Aner, Testuff

“If you follow this process through, you can also augment it with penetration testing”, Aner asserts. This enables professional ethical hackers to have a real-world crack at compromising the software, which provides yet another layer of protection.

CollabNet, a company that sells tools and services to manage distributed software development teams using a central server, also believes that process is king when it comes to secure crowdsourced software development. Guy Marion, VP and general manager of the firm’s Codesion Cloud Services business unit, says that the company’s centralized development platform lets project managers check that remote developers have adhered to coding standards defined by an organization. “You can have code checks incorporated into the workflow of the organization, and incorporate a hierarchical quality assurance sign-off”, he says.

Legalities

This process may not help when it comes to evaluating the provenance of code, however. The line between open-source and commercial software has blurred during the past few years, with many commercial firms relying on open-source code as part of their own products. How do you ensure that the code people contribute to a project is theirs to use, and that its inclusion in the final software doesn’t violate any license agreements?

“It’s going to be nigh-on impossible”, says Michael Westmacott, a committee member of the British Computer Society’s Information Security Specialist Group. “You can run code through an obfuscator and change the variable names, or take a chunk of code and refactor it. It will be difficult to ever tell whether your code has been stolen from somewhere.”
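A hypothetical before-and-after sketch in C makes the point concrete (both routines are invented for illustration): the second function is the first one ‘laundered’ by renaming its identifiers and reworking its loop, and no simple textual comparison will connect the two.

    #include <stdio.h>

    /* 'Original' routine, as it might appear in an open-source project */
    static int sum_of_squares(const int *values, int count)
    {
        int total = 0;
        for (int i = 0; i < count; i++)
            total += values[i] * values[i];
        return total;
    }

    /* 'Laundered' copy: identical behavior, different surface */
    static int accumulate(const int *buf, int n)
    {
        int acc = 0;
        while (n-- > 0)
            acc += buf[n] * buf[n];
        return acc;
    }

    int main(void)
    {
        int data[] = { 1, 2, 3 };
        printf("%d %d\n", sum_of_squares(data, 3), accumulate(data, 3));   /* 14 14 */
        return 0;
    }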

"You can have code checks incorporated into the workflow of the organization, and incorporate a hierarchical quality assurance sign-off"
Guy Marion, CollabNet

TopCoder, a software development service that mixes some elements of both crowdsourcing and traditional outsourcing, concentrates on process as a means of weeding out some of the worst of these risks, while still capitalizing on the power of the crowd. The company takes software development projects from customers, and then runs competitions among third-party developers to create the best code.

“We break up code into modules, so it isn’t like you can just take some open-source code and work it into a competition”, says Ira Heffan, general counsel for TopCoder, explaining how it avoids the misappropriation of existing code among its developer base. “We ask for something specific that gets designed along the way. So if there’s a bunch of extra code, it would fail.”

Given the need for cheap code, written or assembled quickly for a fast time-to-market, it is unlikely that crowdsourced application development will go away anytime soon. Hopefully, however, as companies continue to recognize the importance of software security, processes will improve and future reports from the likes of Coverity will show a marked improvement in code governance. As it is, the state of open-source security has stayed the same for the past few years, Coverity concludes – which makes it a digital time-bomb just waiting to go off.
