Snyk Code now secures AI builds with support for LLM sources

June 25, 2024

As we enter the age of AI, we’ve seen the first wave of AI adoption in software development in the form of coding assistants. Now the next phase is underway, with organizations leveraging increasingly available LLMs to build AI-enabled software. Naturally, as the adoption of LLM platforms such as OpenAI and Google Gemini grows, so does the security risk associated with using them.

Given the newness of generative AI technology, the serious security implications of using data generated by LLMs are often underestimated. Leveraging such data can lead to issues such as prompt injection, which in turn can introduce source code vulnerabilities.

Snyk safeguards your AI-augmented applications

Given that Snyk aims to enable the safe and simple adoption of AI, we are proud to announce that Snyk Code now protects the use of LLM libraries in source code. We have extended our vulnerability-scanning capabilities so that Snyk Code tracks data flowing from calls to supported LLM libraries, including those from OpenAI, HuggingFace, Anthropic, and Google, detects any security issues that arise, and alerts users to them.

What does this mean?

In other words, we have updated our source libraries across all supported languages to incorporate LLM sources. Now, when DeepCode AI, the AI engine powering Snyk Code, uses machine learning to detect sources, sanitizers, and sinks according to our state-of-the-art taint rules, it will identify and alert users to:

  • Data flowing from LLM sources/libraries that reaches a sensitive function or data store without being sanitized.

  • Data passing through LLM sources/libraries that reaches a sensitive function or data store without being sanitized.

Put simply, if a developer uses one of the supported LLM libraries in their code base, Snyk Code now performs a taint analysis, detects untrusted data (data falling into the above categories), generates a taint issue for that specific source, and reports it to the user. For the sake of clarity, any data returned from an LLM library is now treated as a source, regardless of whether the prompt is hardcoded or provided by a user. This means you will be alerted to potential issues even when you run Snyk Code on projects using LLM frameworks or libraries that we do not yet officially support.
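To make this concrete, here is a minimal Python sketch of the kind of flow Snyk Code now flags. It assumes OpenAI’s Python SDK; the model name, database file, table, and prompt are illustrative. The model’s reply is the tainted LLM source, and interpolating it into a SQL statement is the sensitive sink.

```python
import sqlite3

from openai import OpenAI  # OpenAI's official Python SDK

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def lookup_user(question: str) -> list:
    # The model's reply is untrusted data: Snyk Code treats it as an LLM
    # source, even though the surrounding prompt is hardcoded.
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user",
                   "content": f"Extract the username from: {question}"}],
    )
    username = response.choices[0].message.content

    conn = sqlite3.connect("app.db")
    # VULNERABLE sink: tainted LLM output is interpolated straight into SQL,
    # so a crafted reply like "x' OR '1'='1" changes the query's meaning.
    rows = conn.execute(
        f"SELECT * FROM users WHERE name = '{username}'"
    ).fetchall()

    # Safe alternative: a parameterized query breaks the taint flow.
    # rows = conn.execute(
    #     "SELECT * FROM users WHERE name = ?", (username,)
    # ).fetchall()
    conn.close()
    return rows
```

Note that the flow is flagged even though the prompt itself is hardcoded: it is the model’s reply, not the prompt, that is untrusted, and the parameterized query shown in the comment is what removes the taint.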

An example showing how Snyk Code detected and flagged a vulnerable LLM source that was used as a prompt source and introduced a SQL injection.

An example showing how Snyk Code detected and flagged an XSS vulnerability in data pulled from OpenAI’s API, reported as an LLM injection.
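For the XSS case, here is a similar sketch, assuming a small Flask app (the route, prompt, and model name are illustrative): user input passes through the LLM call, and the reply is rendered into the page without escaping.

```python
from flask import Flask, request
from openai import OpenAI

app = Flask(__name__)
client = OpenAI()  # reads OPENAI_API_KEY from the environment


@app.route("/summarize")
def summarize():
    # User input traverses the LLM call, so the reply is tainted twice over:
    # it comes from an LLM source and is influenced by request data.
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user",
                   "content": f"Summarize: {request.args.get('text', '')}"}],
    )
    summary = response.choices[0].message.content

    # VULNERABLE sink: returning the reply as raw HTML lets a crafted
    # completion smuggle a <script> payload into the page (XSS).
    # Escaping first, e.g. markupsafe.escape(summary), would sanitize it.
    return f"<p>{summary}</p>"
```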

What next?

Software development with AI is fast-growing and constantly evolving, so our security analysts are continually researching this topic, and new findings will be published periodically on our blog. This significant extension of Snyk Code’s coverage demonstrates Snyk’s commitment to making AI safe and trustworthy. Now, alongside securing both AI-generated and human-written first-party code, Snyk Code also protects organizations from third-party LLM code issues at the source-code level, so developers can confidently build AI capabilities into their applications.

Already using Snyk Code?

Don’t just find security issues — enjoy one-click, in-IDE vulnerability autofixing with battle-tested DeepCode AI Fix, which is 20% more accurate than GPT-4. Simply switch on Snyk Code Fix Suggestions in the Snyk Preview settings and give it a go!