Half of AI Open Source Projects Reference Buggy Packages

Open source is playing a growing role across the AI technology stack, but just over half (52%) of projects reference known vulnerable dependencies in their manifest files, according to Endor Labs.
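To illustrate what a manifest-level finding of this kind looks like, the sketch below checks pinned Python requirements against a toy advisory list. The package names and fixed versions are hypothetical placeholders, not data from the report; a real check would query a feed such as the OSV database.

```python
# Minimal sketch: flag manifest entries pinned below a known fixed version.
# The advisory data here is a hypothetical placeholder for illustration.
from packaging.version import Version

# Hypothetical advisories: package name -> first fixed version
ADVISORIES = {
    "requests": Version("2.31.0"),
    "pyyaml": Version("5.4"),
}

def flag_vulnerable(requirements_text: str) -> list[str]:
    """Return pinned requirements older than the first fixed version."""
    flagged = []
    for line in requirements_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, _, pinned = line.partition("==")
        fixed = ADVISORIES.get(name.lower())
        if fixed and Version(pinned) < fixed:
            flagged.append(line)
    return flagged

print(flag_vulnerable("requests==2.19.0\npyyaml==6.0\n"))
# ['requests==2.19.0']
```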

The security vendor’s latest State of Dependency Management report claimed that just five months after its release, ChatGPT’s API is used in 900 npm and PyPI packages across “diverse problem domains,” with 70% of these being brand-new packages.
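Many of these packages are thin wrappers around the chat API. The sketch below shows the general shape such a wrapper might take, assuming the official openai Python SDK (v1+ client style) and an OPENAI_API_KEY in the environment; the helper name and prompt are illustrative only.

```python
# Minimal sketch of a typical ChatGPT API wrapper found in new packages.
# Assumes the openai Python SDK (v1+) and OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def summarize(text: str) -> str:
    """Ask the chat model for a one-sentence summary of the given text."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": f"Summarize in one sentence: {text}"}],
    )
    return response.choices[0].message.content
```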

However, as with any open source project, the security risks associated with vulnerable dependencies must be managed, Endor Labs warned.

“Seeing the newest generation of artificial intelligence APIs and platforms capture the public’s imagination the way they have is wonderful, but this report from Endor Labs offers vivid proof that security hasn’t kept pace,” said Michael Sampson, principal analyst at Osterman Research.

“The greater adoption of technologies that provide faster identification and automated remediation of potential weaknesses will make a huge difference in this critical field.”

Read more on malicious open source packages: Hundreds of Malicious Packages Found in npm Registry.

Unfortunately, organizations appear to be underestimating the risk not only of AI APIs in open source dependencies, but also of security-sensitive APIs in general.

Over half (55%) of applications have calls to security-sensitive APIs in their codebase, but that figure rises to 95% when dependencies are included, the report claimed.
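“Security-sensitive APIs” here means functions whose misuse has direct security consequences. As a rough illustration only (not the report’s methodology), the sketch below scans Python source for calls to a small, illustrative watch-list of such functions.

```python
# Crude scan for calls to a few security-sensitive functions.
# The watch-list is illustrative, not Endor Labs' definition.
import ast

SENSITIVE = {"eval", "exec", "pickle.loads", "subprocess.Popen", "os.system"}

def dotted_name(node: ast.AST) -> str:
    """Best-effort dotted name for a call target (e.g. 'os.system')."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        return f"{dotted_name(node.value)}.{node.attr}"
    return ""

def sensitive_calls(source: str) -> list[str]:
    return [
        dotted_name(node.func)
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.Call) and dotted_name(node.func) in SENSITIVE
    ]

print(sensitive_calls("import os\nos.system('ls')\neval('1+1')"))
# ['os.system', 'eval']
```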

Endor Labs also warned that large language model (LLM) technology like ChatGPT is poor at scoring the malware potential of suspicious code snippets. It found that OpenAI’s GPT-3.5 had a precision rate of just 3.4%, while Vertex AI text-bison performed little better, at 7.9%.
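To put those figures in context: precision is the share of flagged packages that are genuinely malicious, so a value of 3.4% means the vast majority of alerts are false alarms. The counts below are illustrative arithmetic, not the report’s raw data.

```python
# Illustrative only: precision = true positives / all positives flagged.
# At 3.4% precision, roughly 97 of every 100 flagged packages are false alarms.
def precision(true_positives: int, false_positives: int) -> float:
    return true_positives / (true_positives + false_positives)

print(round(precision(34, 966), 3))  # 0.034 -> about 28 false alarms per true hit
```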

“Both models produced a significant number of false positives, which would require manual review efforts and prevent automated notification to the respective package repository to trigger a package removal. That said, it does appear that models are improving,” the report noted.

“Those findings exemplify the difficulties of using LLMs for security-sensitive use cases. They can surely help manual reviewers, but even if assessment accuracy could be increased to 95% or even 99%, it would not be sufficient to enable autonomous decision making.”

Elsewhere, the report noted that developers may be wasting their time remediating vulnerabilities in code which isn’t even used in their applications.

It claimed that 71% of typical Java application code is from open source components, but that apps use only 12% of imported code.

“Vulnerabilities in unused code are rarely exploitable; organizations can eliminate or de-prioritize up to 60% of remediation work with reliable insight into which code is reachable throughout an application,” the report said.
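Real reachability analysis builds a call graph across an application and its dependencies, but the idea can be illustrated at a much cruder level: the sketch below lists modules a file imports but never references, a rough proxy for code that cannot be reached. This is an illustration, not Endor Labs’ tooling.

```python
# Crude stand-in for reachability analysis: report imported modules that
# are never referenced, since vulnerabilities confined to unused code are
# rarely exploitable.
import ast

def unused_imports(source: str) -> set[str]:
    tree = ast.parse(source)
    imported, used = set(), set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            imported.update(alias.asname or alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            imported.update(alias.asname or alias.name for alias in node.names)
        elif isinstance(node, ast.Name):
            used.add(node.id)
    return imported - used

print(unused_imports("import json\nimport zlib\nprint(json.dumps({}))"))
# {'zlib'}
```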
